1 Introduction


The  purpose  of  the  Enterprise  Systems  Analysis  line  of  research  is  to  develop  and  evaluate  a 
methodology  for  modeling  and  analyzing  enterprise  systems.  We  define  an  enterprise  system 
as  a  set  of  interacting  organizations  that  serve  a  purpose  yet  have  no  locus  of  control.  Their 
behavior  is  often  complex  and  must  be  viewed  simultaneously  from  several  different 
perspectives  to  be  understood.  The  US  Department  of  Defense  (DoD)  faces  a  number  of 
challenges  where  there  are  multiple  interacting  organizations  with  no  central  locus  of  control. 
For  example: 

•  Combating  the  proliferation  of  counterfeit  parts  in  military  systems 

•  Managing  joint  and  international  acquisition  programs 

•  Coordinating  disaster  and  humanitarian  responses  involving  governments,  NGOs,  and  US 
agencies 

•  Sustaining  the  defense  supplier  base  in  the  face  of  declining  acquisition  quantities 

Consequently,  DoD  has  requested  research  to  enable  DoD  and  Government  policy  makers  to 
better  understand  these  enterprise  problems  and  shape  policy  appropriately.  More  specifically, 
any  enterprise  systems  analysis  methodology  should  enable: 

•  Representing  the  "as-is"  enterprise,  the  "to-be"  enterprise,  and  the  path  between  them 

•  Understanding  relationships  between  variables  and  techniques  for  projecting  outcomes 
and  performance 

•  Providing  a  means  for  experimentation  and  creation  of  response  surfaces  for  analysis  of 
key  tradeoffs 

• Providing a systematic method to search for policy tipping points and identify
counter-intuitive results

•  Creating  an  interactive  environment  for  discussion  and  debate  of  strategies,  policies  and 
plans 

• Enabling key stakeholders to understand the implications and potential second-order
effects of policy and resource decisions

The work performed during this research task (RT-161) is a direct follow-on to the work
performed during RT-138 and RT-110. The outcome of the prior work was a shift in
emphasis away from building a unitary enterprise model toward a core-peripheral approach in
which "peripheral" models could be added or removed as needed to generate scenarios of
interest to enterprise stakeholders. A series of peer reviews also highlighted that the
methodology needed to be enhanced to better detect unintended or counter-intuitive policy
consequences and to better deal with multi-scale ontologies. Consequently, the major tasks for
RT-161  were: 

1.  Apply  the  core-peripheral  approach  to  a  case  study  of  protecting  critical  infrastructure 
(Section  3) 

2. Develop and validate counter-intuitive results, secondary effects, and policy tipping
points (Sections 5 and 7)

3. Extend canonical phenomena and model reuse methods to include multi-scale
ontologies (Sections 4, 5, and 6)

4.  Update  the  enterprise  analysis  methods  to  incorporate  the  results  of  the  other  tasks 
(Section  8) 

From  the  execution  of  these  tasks,  we  were  able  to  develop  a  fairly  substantial  update  to  the 
enterprise  modeling  methodology.  More  specifically,  we  found  from  the  application  of  the 
core-peripheral  approach  to  the  critical  infrastructure  case  study  that  the  peer  reviewers  were 
interested  in  using  the  model  for  analysis  and  insight.  This  is  in  contrast  to  the  results  of  the 
counterfeit  parts  case  study  (RT-138,  RT-110)  where  the  peer  reviewers  tended  to  focus  on  the 
use  of  the  model  for  communication.  While  this  is  by  no  means  an  absolute  validation  of  the 
core-peripheral  approach,  it  is  an  encouraging  result.  Beyond  the  case  study,  a  theoretical 
investigation yielded insights into how to partition an enterprise system across multi-scale
ontologies to generate the core and peripheral models, as well as how they should be used
together to detect the unintended consequences of a policy. Ultimately, this led to a
revision of the enterprise modeling methodology that reorganized the ten steps into three
major phases. Each phase contains a number of more detailed steps that should provide
additional guidance to enterprise analysts. Finally, we also identified a number of promising
avenues for future research to improve the efficacy and applicability of the enterprise
modeling approach.

The remainder of this report is organized as follows: Section 2 briefly reviews the findings of
RT-138 and RT-110 to explain and motivate the work performed during RT-161. Section 3 presents
the results of applying the core-peripheral approach to a case study of critical infrastructure
protection. Section 4 summarizes the results of an industry-government workshop held to
discuss the challenges of model-centric engineering approaches, which share the same technical
and organizational challenges as model-based enterprise analysis approaches. Section 5
provides a detailed literature review of how multi-scale ontologies are modeled and how
counter-intuitive results are detected in both the physical and social sciences. With regard to
the multi-scale ontology aspects of the problem, Section 6 develops a detailed mathematical
analysis of the problem to suggest necessary conditions as well as approaches to mitigate the
challenges of modeling across multiple scales. With regard to detecting counter-intuitive
results, unintended consequences, and policy tipping points, Section 7 develops a proposed
approach to partitioning a multi-scale ontology into core and peripheral models. These models
are then systematically varied to generate scenarios that may identify counter-intuitive results.
This also led to the identification of a hypothesized approach to organize, navigate, and select
models for reuse. However, much additional research is required, and promising directions for
future research are identified. Based on the results of all of the other tasks, Section 8 presents a
revised and enhanced version of the enterprise modeling methodology. Finally, Section 9
concludes the report.


2 Implications from Prior Work


The  research  approach  taken  in  RT-161  was  largely  driven  by  the  findings  of  the  previous 
research tasks RT-138 and RT-110 (Pennock et al. 2015, Pennock et al. 2016). The primary
objective of those studies was to evaluate and refine a ten-step modeling methodology for
understanding enterprise systems (Rouse 2015). The modeling methodology was evaluated by
applying it to a case study of counterfeit electronic part intrusion into a supply chain.

As  the  counterfeit  parts  model  was  presented  to  different  stakeholder  groups,  the  reception 
was  decidedly  mixed.  Some  felt  the  model  would  be  useful  to  explore  policy  options.  Others  felt 
that  the  model  told  them  what  they  already  knew.  Many  of  the  comments  and  observations 
were  familiar  to  anyone  who  has  been  involved  with  simulation  development:  concerns  about 
model  fidelity,  identification  of  additional  phenomena  that  could  be  added,  concerns  about 
data  availability,  concerns  about  predictive  accuracy,  etc. 

It  was  generally  recognized  that  a  model  of  an  enterprise  system  should  not  be  used  to  make 
specific,  quantitative  predictions.  Rather  the  interest  seemed  to  be  in  finding  counterintuitive 
results  or  unexpected  consequences.  However,  the  outputs  of  the  simulation  were  largely 
what  was  expected.  In  some  sense,  this  should  not  be  surprising.  Simulations  are  purely 
deductive,  and  thus,  the  conclusions  are  necessarily  entailed  by  the  premises.  This  does  not 
mean  that  one  never  obtains  unexpected  results  from  a  simulation,  but  when  one  attempts  to 
build  a  relatively  simple  and  interpretable  simulation  model  that  is  consistent  with  the  available 
data  and  validated  via  comparison  to  the  predictions  of  subject  matter  experts,  the  likely 
outcome  is  a  simulation  that  produces  exactly  what  the  subject  matter  experts  said  would 
happen.  This  is  somewhat  similar  to  testing  a  model  against  the  training  set  data.  Under  these 
circumstances,  any  unexpected  results  are  purely  incidental. 

Instead,  there  seemed  to  be  a  sense  that  the  simulation  provided  a  mechanism  to  both 
integrate  and  communicate  the  inputs  of  a  diverse  group  of  subject  matter  experts  to 
stakeholders  and  policy  makers.  Thus,  while  the  consequences  of  any  given  policy  option  may 
not  be  unexpected  for  some  of  the  subject  matter  experts,  they  may  be  unexpected  for  a 
subset  of  the  stakeholders.  As  a  result,  the  simulation  becomes  a  means  to  facilitate 
communication  and  discussion  as  well  as  rule  out  bad  policy  options  quickly. 

Interestingly,  the  issues  encountered  during  this  effort  may  not  necessarily  be  consequences  of 
the  ten-step  methodology  per  se  but  rather  the  reigning  paradigm  for  simulation  development 
in  engineering  and  the  hard  sciences.  Informally,  that  paradigm  can  be  described  as  follows: 
Build  a  simulation  that  faithfully  captures  the  structure  of  the  problem  and  can  reproduce  the 
available  data.  Such  an  approach  is  implicitly  designed  to  maximize  predictive  accuracy  ceteris 
paribus.  This  is  tantamount  to  trend  extrapolation.  However,  few  would  argue  that  simulations 
of  enterprise  systems  should  be  used  for  making  specific  quantitative  predictions.  So  what  are 
they for? Why is anyone interested in them at all?

Based on our case study, we can see two applications that are not entirely consistent. First, the
simulation  can  serve  as  a  means  to  integrate  and  communicate  data  and  expertise  from  diverse 
sources.  In  this  case,  the  knowledge  to  "extrapolate  the  trend"  exists,  but  it  is  scattered.  This 
knowledge  is  captured,  encoded,  and  integrated  via  the  simulation  development  effort.  Once 
this  is  accomplished,  stakeholders  and  decision  makers  can  explore  this  encoded  knowledge  in 
a  way  that  is  not  possible  via  multiple  separate  conversations  with  subject  matter  experts. 
Unexpected  or  counterintuitive  results  may  pop  out,  but  these  will  be  by-products  and  chances 
are  they  will  be  counterintuitive  to  some  but  not  all.  Thus,  the  simulation  serves  more  as  a 
thinking  aid  for  group  decision  making  as  opposed  to  a  means  to  discover  something  truly 
surprising. 

The  second  application  is  to  identify  counterintuitive  results  and  unintended  consequences  of 
policy  options.  Since  we  are  fairly  effective  at  trend  extrapolation,  the  goal  shifts  from 
reproducing the trend to trying to identify what might cause the trend to change. How could
our well-intentioned, well-thought-out policy go wrong? This is exactly the opposite of fitting a
model to data or subject matter expert predictions. Instead, we want to understand the
feasibility of scenarios that we have not experienced or that run against conventional wisdom. In
other  words,  we  are  not  just  interested  in  the  data.  This  suggests  a  very  different  way  to  go 
about  building  a  model. 

If  the  objective  is  really  to  identify  counterintuitive  results  and  unintended  consequences,  then 
the  reigning  paradigm  for  developing  simulations  in  engineering  and  the  hard  sciences  may  be 
suboptimal  for  this  purpose.  Instead,  we  could  take  a  page  from  the  field  of  risk  analysis.  We 
want  to  consider  how  we  could  make  a  policy  produce  unexpected  outcomes.  This  entails 
deliberately  exploring  variations  of  conventional  assumptions,  experimenting  with  alternative 
referential  ontologies  and  theories,  and  hunting  for  feedback  effects.  As  noted  by  Cardoso  and 
Pennock  (2016),  this  is  analogous  to  efforts  to  use  system  dynamics  to  identify  unintended 
consequences  in  policy  analysis.  The  difference  is  that  here  we  would  vary  more  than  just 
balancing  and  reinforcing  loops  as  we  are  intentionally  considering  various  ontologies  and 
scales. 

From an epistemological standpoint, we have no guarantee that any unexpected results
identified can or will happen. Instead, they simply establish the possibility. Once these are
identified, they can be adjudicated and investigated further. To put it succinctly, rather than
faithfully reproducing what we see, the model should give us guidance as to where to
look.

One could argue that the enterprise modeling methodology evaluated in the two preceding SERC
tasks  is  a  product  of  the  reigning  paradigm.  Consequently,  it  is  more  suitable  for  the  first 
application  than  the  second.  This  may  explain,  in  part,  why  the  counterfeit  parts  simulation 
generated  more  interest  as  a  communication  tool  than  a  means  to  find  unexpected  policy 
consequences.  However,  the  methodology  seems  to  be  flexible  enough  to  accommodate  the 
second application as well.

To accommodate the idea of using the simulation to identify unintended consequences, we
modified the modeling approach based on the lessons learned from the counterfeit parts case
study. This modified approach was then evaluated in this research task via a case study of
critical infrastructure protection. We found in RT-138 that using a multi-level view of the
enterprise  is  useful  for  conceptualizing  the  enterprise,  but  we  suspect  that  the  output  metrics 
of  interest  are  usually  the  direct  output  of  one  or  two  of  the  layers.  Thus,  it  makes  sense  to 
create  an  integrated  core  model  that  generates  the  values  of  these  output  metrics.  We  could 
consider  the  core  model  the  first  order  logic  that  governs  the  values  of  the  output  metrics. 

We are then interested in searching for higher-order effects that one might consider
counterintuitive results or unintended effects. The natural place to look for these is in
interactions with the other layers. However, there may be more than one way to represent the
other  layers.  This  is  particularly  true  for  human  and  social  behaviors.  Returning  to  the 
counterfeiting  example,  should  we  model  counterfeiters  as  classical  utility  maximizers?  Should 
we  employ  prospect  theory?  Information  economics?  Each  approach  may  reveal  a  different 
insight.  More  importantly,  each  may  have  a  different  impact  on  the  behavior  of  the  core  model. 

Thus,  we  represent  the  non-core  layers  using  peripheral  models.  The  purpose  of  the  peripheral 
models  is  to  "perturb"  the  core  model  to  generate  useful  insights.  A  major  risk  to 
implementing  a  policy  option  in  an  enterprise  is  crossing  a  tipping  point  that  no  one  knew  was 
there.  The  peripheral  models  can  be  used  to  trigger  tipping  points  in  the  behavior  of  the  core 
model.  Finding  the  tipping  points  depends  on  exploring  structural  and  ontological  variations  of 
the  peripheral  models  (Pennock  &  Gaffney  2016). 

While the natural tendency in enterprise modeling seems to be to maximize predictive accuracy
by maximizing the fidelity of the model (i.e., adding as many relevant factors as possible), this
approach has rapidly diminishing returns, as it increases the degrees of freedom and risks
overfitting sparse data (Pennock & Gaffney 2016). Rather, it may be more productive to build a
relatively  simple  core  model  and  then  selectively  perturb  it  with  structural  variations  in  the 
peripheral  models  to  see  if  this  triggers  any  unexpected  behaviors  (e.g.,  tipping  points). 
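
The perturbation idea can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in rather than a model developed in this research: the toy core model, the two alternative peripheral models of adversary behavior, the 0.6 exit threshold, and the 0.3 jump size used to flag a candidate tipping point are all assumptions chosen only to show the mechanics of swapping peripheral models around a fixed core.

```python
from typing import Callable, List, Tuple


def core_model(inspection_rate: float, adversary_effort: float) -> float:
    """Toy core model (assumed): share of clean parts falls as uncaught effort rises."""
    uncaught = adversary_effort * (1.0 - inspection_rate)
    return max(0.0, 1.0 - uncaught)


# Hypothetical peripheral models: alternative theories of adversary response to policy.
def steady_adversary(inspection_rate: float) -> float:
    """Adversary effort is fixed regardless of the policy setting."""
    return 0.5


def exit_threshold_adversary(inspection_rate: float) -> float:
    """Adversaries work hard until inspection makes the market unprofitable, then exit."""
    return 0.0 if inspection_rate >= 0.6 else 0.9


def sweep(peripheral: Callable[[float], float], rates: List[float]) -> List[float]:
    """Run the core model across policy settings, perturbed by one peripheral model."""
    return [core_model(rate, peripheral(rate)) for rate in rates]


def find_jumps(
    rates: List[float], outputs: List[float], min_jump: float = 0.3
) -> List[Tuple[float, float, float]]:
    """Flag adjacent policy settings where the output metric changes abruptly."""
    jumps = []
    for i in range(1, len(rates)):
        change = outputs[i] - outputs[i - 1]
        if abs(change) >= min_jump:
            jumps.append((rates[i - 1], rates[i], change))
    return jumps


RATES = [i / 10 for i in range(11)]  # policy settings from 0.0 to 1.0
for name, peripheral in [
    ("steady", steady_adversary),
    ("exit threshold", exit_threshold_adversary),
]:
    print(name, find_jumps(RATES, sweep(peripheral, RATES)))
```

In this sketch the core model is unchanged between runs; only the peripheral model differs. A discontinuity that appears under one peripheral model but not another is a candidate for further investigation, not a prediction.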

Evaluating  and  refining  this  core-peripheral  approach  to  detect  unintended  consequences  of  a 
policy  is  the  primary  objective  of  this  research  task.  The  remainder  of  this  report  documents 
those  efforts.  The  key  elements  were: 

• A case study of protecting critical infrastructure to evaluate the mechanics of the
core-peripheral approach (Section 3)

• An industry-government workshop to understand the state of practice in model-centric
engineering (which is an analogous problem to using a multi-level model to find
unintended consequences) (Section 4)

•  A  detailed  literature  review  of  how  multi-level  issues  are  handled  in  the  physical  and 
social  sciences  as  well  as  how  unintended  consequences  and  counterintuitive  results 
are detected (Section 5)

• A set-theory-based analysis of the mathematics behind developing a valid multi-level
model (Section 6)

•  Initial  development  of  an  approach  to  systematically  identify  unintended  consequences 
(Section  7) 

Ultimately,  the  results  of  these  efforts  led  us  to  propose  changes  to  the  ten-step  enterprise 
modeling  methodology  (Section  8). 


3  Critical  Infrastructure  Protection  Case  Study 


Case  studies  have  been  used  in  this  research  as  a  primary  method  to  aid  in  evaluation  of 
enterprise  modeling  methodologies.  Here,  we  discuss  a  case  study  model  involving  critical 
infrastructure. 


3.1  Background 

Critical  infrastructure  includes  such  systems  as  the  power  grid,  communications  networks, 
transportation  networks,  food  delivery  systems,  financial  systems,  emergency  response 
systems,  and  numerous  others.  These  systems,  as  the  name  implies,  have  become  essential  to 
the  operation  of  modern  society,  as  well  as  to  national  defense.  At  the  same  time,  critical 
infrastructure  systems  have  become  extensively  interconnected  and  networked.  While  there 
are  clear  benefits  to  the  functionality  of  infrastructure  from  this  interconnection,  the 
interdependencies  introduced  can  create  vulnerabilities  that  are  difficult  to  identify  and 
safeguard.  These  vulnerabilities  may  be  due  to  unintentional  failures  (e.g.,  faulty  or  aging 
components)  or  to  intentional  actions  (e.g.,  terrorism,  cyber-warfare,  etc.).  Once  a  failure 
occurs,  it  can  cause  cascading  failures  in  other  systems  and  infrastructures  due  to 
interconnections. 
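
A minimal sketch of such a cascade is given below. The dependency map and the failure rule are assumptions for illustration only: node names are hypothetical, and a node is taken to fail as soon as any node it depends on has failed, which ignores redundancy and partial degradation.

```python
from collections import deque
from typing import Dict, Iterable, List, Set

# Hypothetical dependency map: each node lists the nodes it needs in order to operate.
DEPENDS_ON: Dict[str, List[str]] = {
    "power_substation": [],
    "reservoir": [],
    "water_pump": ["power_substation"],
    "internet_router": ["power_substation"],
    "grid_control_center": ["internet_router"],
    "hospital": ["water_pump", "power_substation"],
}


def cascade(depends_on: Dict[str, List[str]], initial_failures: Iterable[str]) -> Set[str]:
    """Return every node that fails, assuming any failed dependency fails its dependents."""
    # Invert the map so each node knows which nodes rely on it.
    dependents: Dict[str, List[str]] = {node: [] for node in depends_on}
    for node, needs in depends_on.items():
        for need in needs:
            dependents.setdefault(need, []).append(node)

    failed: Set[str] = set(initial_failures)
    queue = deque(failed)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in failed:
                failed.add(dependent)
                queue.append(dependent)
    return failed


print(sorted(cascade(DEPENDS_ON, ["power_substation"])))
```

Even in this toy network, a single power failure reaches the water, communications, and emergency response nodes, while a reservoir failure stays isolated because nothing in the assumed map depends on it.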

With  the  increased  importance  of  infrastructure,  plus  a  number  of  high-impact  failures  in 
recent  years,  a  significant  body  of  research  has  studied  the  design,  behavior,  performance  and 
vulnerabilities  of  these  systems.  This  research  has  largely  focused  on  the  technical  aspects  of 
these  factors.  Like  many  complex  systems-of-systems,  though,  critical  infrastructure  operates 
in  an  enterprise  context.  That  is,  critical  infrastructure  is  not  a  monolithic  system,  but  different 
parts  of  these  infrastructure  systems  are  owned  and  operated  by  different  firms  or  agencies.  In 
addition,  regulatory  agencies  and  other  organizations  interact  to  influence  behavior  of  different 
actors.  This  collection  of  organizations  is  an  extended  enterprise  concerned  with  safe  and 
effective  operation  of  the  interconnected  infrastructure  systems. 

Critical  infrastructure  was  established  as  a  national  priority  in  the  1990s  with  a  number  of 
directives,  including  Presidential  Decision  Directive  NSC/63  (White  House,  1998).  This  directive 
established  a  public-private  partnership  for  managing  and  protecting  critical  infrastructure, 
effectively an enterprise consisting of government agencies and private firms. This
public-private partnership is detailed in such documents as the National Infrastructure
Protection Plan (DHS, 2013).

Here, we explore the behavior and performance of critical infrastructure from an enterprise
perspective.  In  particular,  we  are  interested  in  the  resilience  of  such  systems.  We  use  an 
enterprise  modeling  methodology  that  addresses  the  socio-technical  behavior  of  the  enterprise 
to  create  a  simulation  of  the  enterprise  and  use  this  simulation  to  study  the  effects  of  various 
policies  and  external  effects. 


3.2  Modeling  of  Critical  Infrastructure 

The importance and complexity of the problem have inspired a variety of research efforts. For
instance,  Dehghani  and  Sherali  (2016)  develop  an  optimization  approach  for  scheduling 
maintenance  to  mitigate  disaster  impacts,  while  accounting  for  stochastic  system  behavior. 

Due  to  the  high  level  of  stochastic  behavior,  though,  most  research  has  focused  on  simulation. 
Additionally,  systems-of-systems  modeling  frameworks  have  been  introduced  as  a  way  to 
represent  interactions. 

Otto  et  al.  (2016)  specify  a  system-of-systems  framework  for  modeling  infrastructure  to  support 
long-term  simulation  of  these  systems.  Min  et  al.  (2007)  combine  IDEF  models,  system 
dynamics  and  non-linear  optimization  to  provide  capability  to  set  control  variables  to  minimize 
the effect of disruptions. Grogan and de Weck (2015) propose a systems-of-systems modeling
framework  to  support  simulation  of  infrastructure  systems. 

Several  in-depth  reviews  of  modeling  methodologies  and  applications  for  critical  infrastructure 
have  been  published  (Ouyang,  2014;  Pederson  et  al.,  2006;  Yusta  et  al.,  2011).  These  reviews 
highlight  the  role  of  agent-based  simulation  in  addressing  individual  decision-makers  and  the 
bottom-up  nature  of  many  infrastructure-related  phenomena,  plus  the  role  of  system  dynamics 
simulation  in  addressing  non-linear  phenomena  and  feedback  loops.  In  particular,  agent-based 
approaches  are  well-suited  to  modeling  enterprise  systems,  since  complex  agents  can  represent 
the  different  enterprise  actors  (firms,  agencies,  etc.).  Most  approaches  that  use  agent-based 
modeling,  however,  use  agents  for  individual  decision-makers  and  system  elements.  One 
exception  involves  a  large-scale  architecture  for  composing  models  of  different  infrastructure 
systems  for  different  analyses  with  a  focus  on  socio-technical  behavior  (Atkins  et  al.,  2008). 
Fujimoto et al. (2016) discuss perspectives on applying dynamic data driven application systems
(DDDAS) to simulation of smart cities and infrastructure grids, whereby system data drives
simulation computations that provide dynamic adaptation to improve performance.

Pederson  et  al.  (2006)  distinguish  between  single  models  versus  coupled  models.  Single  models 
combine  different  infrastructure  systems  into  one  model,  while  coupled  models  feature  a 
coupled  collection  of  models,  each  with  a  single  infrastructure.  Additionally,  some  models 
couple  with  earthquake  models  or  other  disaster  models  or  with  database  information  such  as 
GIS.  Large-scale  models  often  suffer  from  long  run  times.  Rosen  et  al.  (2016)  report  on  an 
approach using neural network metamodels and stochastic kriging metamodels to improve
model  response  times  for  decision  support  in  critical  infrastructure  network  evaluation. 

Many  of  the  studies  above  use  system  availability  or  recovery  from  disruption  as  measures  of 
infrastructure performance. Increasingly, though, research has focused on resilience as a
performance measure for infrastructure systems, with varying definitions and metrics. Hosseini et al.
(2016)  review  various  resilience  definitions  and  metrics  and  distinguish  between  qualitative 
frameworks  and  quantitative  approaches.  Francis  and  Bekera  (2014)  propose  a  metric  based 
on  three  types  of  resilience  -  absorptive,  adaptive  and  restorative.  Absorptive  resilience  refers 
to  the  ability  of  a  system  to  absorb  a  shock  and  not  lose  performance  significantly.  Adaptive 
resilience  refers  to  the  ability  of  a  system  to  reconfigure  itself  to  minimize  the  impact  of  a 
shock.  Finally,  restorative  resilience  refers  to  the  ability  of  a  system  to  return  to  an  acceptable 
or  nominal  state  of  performance  quickly  after  a  shock. 
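
To make two of these notions concrete, the following sketch computes simple indicators from a performance trace. These are illustrative proxies of our own choosing, not the metric proposed by Francis and Bekera (2014): the function names, the 90% "acceptable" threshold, and the sample trace are all assumptions. Adaptive resilience is omitted because it cannot be read from a single trace without knowing how the system reconfigured itself.

```python
from typing import List, Optional


def absorptive_indicator(performance: List[float], nominal: float) -> float:
    """Fraction of nominal performance retained at the worst point of the trace."""
    return min(performance) / nominal


def restorative_indicator(
    performance: List[float],
    nominal: float,
    shock_index: int,
    acceptable_fraction: float = 0.9,
) -> Optional[int]:
    """Time steps after the shock until performance is acceptable again.

    Returns 0 if performance never drops below the acceptable level, and None
    if it drops and does not recover within the trace.
    """
    threshold = acceptable_fraction * nominal
    dropped = False
    for step, value in enumerate(performance[shock_index:]):
        if value < threshold:
            dropped = True
        elif dropped:
            return step
    return None if dropped else 0


# Hypothetical trace: service level per time step, with a shock at index 2.
trace = [100.0, 100.0, 60.0, 70.0, 85.0, 95.0, 100.0]
print(absorptive_indicator(trace, nominal=100.0))
print(restorative_indicator(trace, nominal=100.0, shock_index=2))
```

Keeping the two indicators separate reflects the distinction in the text: a system can absorb a shock well yet recover slowly, or lose much of its performance and still restore it quickly.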

Similar  to  the  counterfeit  parts  case  study,  there  are  a  number  of  features  that  make  this  an 
enterprise  problem. 

•  There  is  no  locus  of  control. 

o  Each  sector  of  critical  infrastructure  is  overseen  and  regulated  by  a  different 
federal agency. For instance, the Department of Energy addresses the power
grid. The Department of Homeland Security oversees the communications
infrastructure.

o  Private  firms  manage  different  parts  of  the  various  infrastructure  networks. 

o  These  firms  typically  operate  at  the  state  level  rather  than  the  national  level  and 
are  regulated  by  state  agencies. 

•  There  is  significant  adaptive  behavior. 

o Terrorists may adapt to the different strategies used to protect infrastructure.

o  Populations  adapt  to  infrastructure  outages  and  potential  outages. 

•  There  is  significant  complexity. 

o Clearly, there is significant socio-technical behavior from the market.
Socio-technical behavior is inherently complex.

o  In  addition,  there  are  multiple  interconnected  infrastructure  systems  that 
interact,  with  sometimes  unpredictable  effects. 


3.3  Methodology 

In  recent  years,  there  has  been  growing  interest  in  modeling  and  analyzing  enterprise  systems. 
An  enterprise  is  a  collection  of  organizations  and  resources  that  cooperate  in  pursuit  of  some 
goal  or  mission  (Rouse,  2005).  To  address  improved  enterprise  performance,  a  variety  of 
research  efforts  have  created  methods  to  model  and  analyze  enterprise  systems  (Barjis,  2011; 
Gharajedaghi,  2011;  Giachetti,  2010;  Glazner,  2011),  with  a  focus  on  design  or  transformation 
of  the  enterprise. 

Our primary interest has been in generic and reusable methods for modeling enterprises. One
approach  to  modeling  the  variety  of  enterprise  phenomena  is  to  consider  different  levels  of 
enterprise  organization  and  behavior.  Enterprises  are  often  conceptualized  as  operating  at  a 
macro-level,  with  different  agencies,  firms  and  other  organizations  interacting  to  support  a 
common goal in the context of a larger economy. However, they also operate at a micro-level
with the transactional delivery of products and services to individual consumers. In between
these two levels, enterprises can be decomposed into a number of elements and activities,
including units within organizations, supply chains supporting the transformation of raw
materials  to  delivered  products  and  services,  and  workforces  that  perform  the  activities 
supporting  enterprise  goals.  Thus,  enterprises  can  be  represented  as  multi-level  systems. 


Such  a  multi-level  formalism  and  associated  modeling  methodology  are  proposed  by  Rouse 
(2015).  The  formalism  features  an  eco-system  level,  a  networked  inter-organizational  structure 
level,  an  operational  delivery  level,  and  a  work  practices  level.  The  methodology  then  has  ten 
steps,  starting  with  abstract  modeling,  then  moving  to  composition  of  multiple  modeling 
formalisms  needed  for  various  enterprise  phenomena,  and  finally  addressing  traditional  issues 
of  parameter  estimation,  model  building,  and  verification  and  validation. 

A modified version of this methodology was proposed by Pennock et al. (2017) based on results
of our previous case study addressing counterfeit parts in the DoD supply chain. The next
subsections describe the application of this revised methodology to critical infrastructure via a
series  of  steps.  The  focus  is  on  the  first  series  of  steps  in  modeling  as  opposed  to  the  later 
stages  of  data  gathering,  model  implementation,  and  experimentation. 


3.3.1  Central  Questions  of  Interest 

The  first  step  of  the  methodology  is  to  decide  on  the  central  question(s)  of  interest.  This 
question  relates  to  the  intended  use  of  the  model.  In  an  enterprise  problem  context,  this  step 
also  incorporates  the  perspectives  of  multiple  stakeholders  and  potentially  multiple  uses.  Thus, 
it may not be as obvious as for a model of a purely technical system with one or two
stakeholders.

The model is intended to address the following question: what are the best mixes of
investments, standards, and policies for providing long-term value in terms of availability, safety,
and security versus cost?


3.3.2  Key  Phenomena 

The  next  step  in  the  methodology  is  to  characterize  the  key  phenomena  that  should  be 
represented.  Based  on  the  literature  review  discussed  previously,  key  phenomena  are 
organized  into  several  different  categories  as  shown  below  in  Table  1. 


Table  1.  Key  phenomena  for  critical  infrastructure 


Category                     Phenomena of Interest
---------------------------  ------------------------------------------------
Infrastructure systems       • Infrastructure nodes
                             • Network architecture linking nodes
                             • Service delivery between nodes
                             • Redundancy, hardness
                             • System performance criteria
                             • Maintenance and repair schedules

Infrastructure system        • Relationships and dependencies from one
interconnections               infrastructure system to another
                             • Effects of outages in one infrastructure
                               system to another

Enterprise actors            • Firms that own segments of infrastructure

Policy                       • Federal agencies
                             • State regulations
                             • Redundancy/hardness
                             • Foreign ownership

Exogenous factors            • Technological progress
                             • Threat profiles

Table  1  focuses  on  generic  elements  for  infrastructure  systems.  The  model  implemented  here 
addresses  the  electrical  grid,  water  delivery  systems,  and  the  internet  communications  grid. 
The  elements  above  are  therefore  specialized  for  purposes  of  representing  these  infrastructure 
systems.  For  instance,  the  service  provision  of  the  electrical  grid  is  power,  and  the  service 
provision  of  the  water  delivery  system  is  drinkable  water.  The  internet  provides 
communications  via  data  packets.  The  electrical  grid  consists  of  power  plants,  transmission 
lines,  substations,  distribution  lines,  and  demands.  The  water  delivery  system  consists  of 
sources,  reservoirs,  treatment  plants,  pipes,  and  demands.  The  internet  system  consists  of 
servers  and  network  cabling. 


3.3.3  Visualizations  of  Relationships  Among  Phenomena 

The  multi-level  modeling  construct  has  been  useful  in  conceptualizing  different  phenomena  and 
how  they  operate  at  different  levels.  The  four  levels  consist  of  the  eco-system,  the  system 
structure,  delivery  operations,  and  work  practices.  In  this  effort,  the  focus  is  largely  on  the 
system  structure  and  delivery  operations.  Work  practices  come  into  play  implicitly  when  repair 
and  maintenance  is  conducted  or  when  services  are  delivered,  but  these  are  not  modeled  in 
detail.  The  eco-system  influences  the  behavior  and  performance  of  the  infrastructure  systems, 
often  in  an  exogenous  manner.  Figure  1  shows  a  visualization  of  the  multi-level  model  for 
critical  infrastructure. 

Figure  1.  Conceptual  enterprise  model  of  critical  infrastructure 


3.3.4  Key  Trade-offs  That  Appear  to  Warrant  Deeper  Exploration 

Next,  key  trade-offs  are  articulated  so  that  the  phenomena  underlying  them  can  be  included. 
The  following  trade-offs  were  identified  for  critical  infrastructure. 

•  Trade-off  between  resilience  and  cost  for  different  levels  of  redundancy  and  protection 
via  hardness; 


•  Trade-off  between  resilience  and  cost  for  different  strategies  of  upgrading  technologies 
and  standards; 


•  Trade-off  between  service  level  and  resilience  for  different  architectures  and 
interconnection  patterns. 


3.3.5  Alternative  Representations  of  these  Phenomena 

Simulation  modeling  provides  three  primary  paradigms  -  discrete-event  (DE),  agent-based  (AB) 
and  system  dynamics  (SD).  Discrete-event  models  focus  on  events,  processes  that  cause 
events,  and  new  events  triggered  by  executing  events.  Agent-based  models  focus  on  elements 
within  a  model,  how  they  react  to  messages  and  state  changes,  and  how  system  behavior 
emerges over time as a result of individual element behaviors. System dynamics models
address  rates  of  change,  interdependencies,  feedback  loops  and  lags  in  system  behavior.  Table 
2  shows  alternative  modeling  representations  for  the  categories  of  enterprise  elements  in  Table 
1. 


Table  2.  Alternative  modeling  representations 


Category 

Representation  alternatives 

Infrastructure systems

•  Agent-based  -  AB  models  provide  support  for  state 
transitions  to  model  the  node  state  behavior  and  inheritance 
to  model  different  types  of  nodes  with  commonalities,  plus 
message-passing  between  nodes. 

•  Discrete-event - DE models provide support for discrete
elements moving through processes representing
infrastructure networks. There is limited support for
inheritance and message-passing (except through signal-hold).

•  System  dynamics  -  SD  models  provide  support  for  continuous 
flows  found  in  many  infrastructure  systems. 

•  Agent-based  and  system  dynamics  models  are  preferred  due 
to  their  complementary  support  for  state-based  behavior  and 
continuous  flow. 

Infrastructure system inter-connections

•  Agent-based  -  AB  models  support  message-passing  between 
different  infrastructure  systems. 

•  Discrete-event  -  DE  models  support  discrete  elements 
transitioning  between  infrastructure  systems  and  signal-hold 
for  message-passing. 

•  System  dynamics  -  SD  models  support  continuous  flow 
between  infrastructure  systems. 

•  Agent-based  models  are  preferred  due  to  the  flexibility  of 
their  message-passing  capability  over  discrete-event  models, 
and  due  to  the  discrete  nature  of  on-off  relationships  that 
exist  for  service  provision  between  infrastructure  systems. 

Enterprise  actors 

•  Agent-based  -  AB  models  have  been  used  extensively  to 
model  interactions  of  individual  units,  as  well  as  adaptive 
behavior.  In  addition,  there  is  potential  to  embed  micro- 
economic  models  in  agents. 

•  Discrete-event  -  DE  models  are  not  typically  used  for 
enterprise  actor  models. 

•  System  dynamics  -  SD  models  are  not  typically  used  for 
enterprise  actor  models. 

•  Agent-based  models  are  preferred  due  to  their  extensive  use 
in  modeling  individual  unit  interactions,  adaptive  behavior 
and  economic  behavior. 
Policy

•  Agent-based  -  AB  models  have  been  used  extensively  to 
model  interactions  of  individual  units,  as  well  as  adaptive 
behavior.  This  could  extend  to  policy  units. 

•  Discrete-event - DE models are not as widely used for policy
models as AB and SD models.

•  System  dynamics  -  SD  models  have  seen  extensive  use  for 
policy  study,  with  the  concept  of  variables  being  used  as 
policy  levers  having  resulting  interaction  effects. 

•  Agent-based models are preferred due to the ability to
represent  adaptive  behavior  of  policy-makers.  Policies  would 
be  represented  as  variables,  with  the  policy  effects 
embedded  in  other  sub-models. 

Exogenous environment

•  Agent-based  -  AB  models  support  exogenous  elements  such 
as  organizations/actors. 

•  Discrete-event  -  DE  models  support  exogenous  elements 
involving  process  behavior. 

•  System  dynamics  -  SD  models  can  be  used  to  aggregate 
behaviors  and  incorporate  feedback  loops,  lags,  etc.  SD 
models  have  been  used  for  macro-economic  phenomena. 

•  Agent-based  and  system  dynamics  models  are  preferred  due 
to,  respectively,  their  representation  of  organizations/actors, 
and  their  representation  of  aggregate  effects  not  requiring 
detail. 

Table  3  presents  descriptions  of  the  representations  selected  for  different  categories  of 
phenomena  being  modeled. 


Table  3.  Selected  representations 


Category

Selected representation

Infrastructure systems

Agent-based model for nodes and arcs. System dynamics models
and similar continuous flow models embedded into agents for
service flow.

Infrastructure system inter-connections

Agent-based network of connections with message-passing for
state-change notifications.

Enterprise actors

Agent-based model with actors modeled as complex agents and
relationships modeled by message-passing.

Policy

Global variables set by analyst with associated agent-based
policy actors to enable policy adaptation.

Exogenous environment

Agent-based and system dynamics models representing trends
in technology progress and technology off-shoring.

Report No. SERC-2017-TR-106, April 30, 2017


3.3.6  Ability  to  Connect  Alternative  Representations 


The  representations  in  Table  3  consist  of  agent-based  and  system  dynamics  representations. 
These  two  paradigms  can  interoperate  via  such  simulation  platforms  as  AnyLogic™,  where  both 
formalisms  are  supported  in  underlying  Java™. 

The  key  is  to  design  the  interaction  so  that  it  is  computationally  efficient  and  scalable.  For 
instance,  such  interactions  can  occur  via  condition-checking  or  by  message-passing.  With 
condition  checking,  a  variable  is  monitored  continually,  and  when  it  reaches  a  threshold,  an 
event  is  triggered.  This  can  be  computationally  intensive  if  there  are  numerous  such  variables 
being  monitored.  Thus,  message-passing  is  typically  preferred. 
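
The difference in computational effort between the two interaction styles can be illustrated with a small sketch. It is not taken from the prototype; the class and function names (`PolledVariable`, `Publisher`, `run_polling`, `run_messaging`) are hypothetical, and the sketch simply counts how much work each style performs when a single threshold crossing occurs during a run.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PolledVariable:
    """Condition-checking: the variable must be tested on every time step."""
    value: float
    threshold: float
    checks: int = 0  # number of comparisons performed so far

    def exceeded(self) -> bool:
        self.checks += 1
        return self.value >= self.threshold


@dataclass
class Publisher:
    """Message-passing: handlers run only when a state change is announced."""
    subscribers: List[Callable[[str], None]] = field(default_factory=list)
    messages_sent: int = 0

    def notify(self, message: str) -> None:
        for handler in self.subscribers:
            self.messages_sent += 1
            handler(message)


def run_polling(num_variables: int, steps: int, trip_step: int) -> int:
    """Return the number of threshold comparisons needed with polling."""
    variables = [PolledVariable(0.0, 1.0) for _ in range(num_variables)]
    for step in range(steps):
        if step == trip_step:
            variables[0].value = 1.0  # one variable crosses its threshold
        for variable in variables:
            variable.exceeded()
    return sum(variable.checks for variable in variables)


def run_messaging(num_subscribers: int, steps: int, trip_step: int) -> int:
    """Return the number of messages delivered with message-passing."""
    received: List[str] = []
    publisher = Publisher([received.append for _ in range(num_subscribers)])
    for step in range(steps):
        if step == trip_step:
            publisher.notify("threshold reached")
    return publisher.messages_sent


polling_work = run_polling(num_variables=100, steps=1000, trip_step=500)
messaging_work = run_messaging(num_subscribers=100, steps=1000, trip_step=500)
```

Under these assumptions, polling performs one comparison per variable per step, while message-passing performs work only at the step where the state changes.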


3.3.7  Core  Models  and  Peripheral  Models 

Our  approach  uses  a  "core-peripheral"  method  to  construct  the  overall  model,  similar  to  the 
approach  used  in  the  counterfeit  parts  enterprise  model.  The  core  model  consists  of  the  set  of 
phenomena  that  are  central  to  the  enterprise.  Peripheral  models  are  developed  to  support 
specific  analyses  of  interest.  For  instance,  in  the  counterfeit  parts  model,  the  core  model 
consists  of  the  defense  supply  chain,  the  systems  and  constituent  elements  supported  by  the 
supply  chain  through  manufacturing  and  sustainment,  and  the  enterprise  actors  that  own  and 
manage  different  parts  of  the  supply  chain.  One  peripheral  model  addresses  the  recycling  of 
electronic  waste.  Often,  this  waste  is  exported  to  third-world  nations,  and  some  of  it  is 
processed  into  fraudulent  counterfeit  electronics  that  are  imported  into  the  U.S.  The 
peripheral  model  addresses  the  behavior  of  the  recycling  market  when  export  restrictions  are 
put  in  place. 

Here, the core model consists of the different infrastructure systems and their
inter-connections, plus the set of enterprise actors and policy actors that interact with the
infrastructure  systems.  This  core  model  can  be  considered  as  the  "steady-state"  representation 
of  the  infrastructure  systems.  The  peripheral  models,  on  the  other  hand,  represent  disruptive 
factors  such  as  terrorism  or  a  natural  disaster.  It  is  the  effect  of  these  peripheral  models  on  the 
core  model  that  is  of  interest  (as  well  as  what  protections  and  recovery  mechanisms  are 
represented in the core model). This differs somewhat from the approach taken in the
counterfeit parts model, where the disruptive forces (i.e., counterfeiters) are part of the
core model because they are part of the supply chain.

Figure  2  shows  the  model  architecture  with  the  core  model  and  various  peripheral  models. 
[Figure 2 labels: Peripheral Model 1, Peripheral Model 2, Peripheral Model 3, Peripheral Model 4]

Figure 2. Model architecture

The  next  section  provides  details  on  the  model  implementation. 


3.4  Model  Description 

A  prototype  enterprise  simulation  for  critical  infrastructure  has  been  implemented  using 
AnyLogic®  7.  AnyLogic  is  a  commercial  simulation  software  package  that  provides  capability  for 
multi-method modeling of complex systems using the agent-based, discrete-event and system
dynamics paradigms. As such, it is useful for enterprise modeling.


3.4.1  Infrastructure  Systems 

Three  different  infrastructure  systems  are  modeled  -  the  electrical  grid,  the  water  delivery 
system,  and  the  internet  portion  of  the  communications  system. 

The  electrical  grid  sub-model  starts  with  a  set  of  power  plants.  These  can  be  based  on  coal,  gas, 
oil/gas, or nuclear power. They provide power to a set of transmission sub-stations via
high-voltage transmission lines. For a given power level, a higher voltage means a lower
current and therefore less resistive power loss over the long distances of power
transmission. These sub-stations are nodes in the
transmission  network.  Sub-stations  can  transmit  power  to  other  transmission  sub-stations, 
depending  on  the  layout  of  the  transmission  network.  Eventually,  a  transmission  sub-station 
will  link  via  a  transmission  line  to  a  distribution  sub-station.  Distribution  sub-stations  reduce 
the  voltage  of  the  power  transmission  via  transformers  so  that  it  can  be  supplied  to  customers. 
A  set  of  distribution  lines  then  distributes  power  to  industrial  and  residential  demand  sources. 

The  power  plants  are  implemented  as  agents  having  a  set  of  outbound  transmission  lines  and  a 
certain  megawatt  rating.  In  addition,  they  have  state-based  behavior  as  shown  in  Figure  3. 
Figure 3. Power plant availability cycle

The  transmission  and  distribution  lines  are  also  implemented  as  agents.  These  agents  have 
simple  system  dynamics  models  embedded  to  represent  the  continuous  flow  of  current  and 
power  through  the  line.  There  are  two  variants  of  transmission  and  distribution  lines.  In  the 
simplified  version,  the  concern  is  whether  power  is  supplied  or  not  to  the  line.  The  second  is 
more  detailed,  and  it  contains  a  direct  current  representation  of  power  flowing  through  the 
distribution  network.  This  is  an  approximation  for  the  alternating  current  power  grid.  In  this 
variant,  electrical  concepts  modeled  include  voltages,  resistance  and  power  loss.  This  is 
depicted  in  Figure  4. 


[Figure 4 elements: demandPower, inputCurrent, currentStock, outputCurrent]

Figure 4. Power line using DC transmission model
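
The relationships among voltage, resistance and power loss in the detailed variant can be sketched as follows. This is an illustrative calculation only, not the embedded system dynamics model itself. The function name `dc_line_flow` and the numerical values are assumptions, and the sketch uses the standard DC relations I = P / V and P_loss = I^2 R.

```python
def dc_line_flow(power_in_w: float, voltage_v: float, resistance_ohm: float) -> dict:
    """Return current, resistive loss and delivered power for a DC line."""
    if voltage_v <= 0:
        raise ValueError("voltage must be positive")
    current_a = power_in_w / voltage_v        # I = P / V
    loss_w = current_a ** 2 * resistance_ohm  # P_loss = I^2 * R
    return {
        "current_a": current_a,
        "loss_w": loss_w,
        "delivered_w": power_in_w - loss_w,
    }


# Hypothetical line: 100 MW sent through 10 ohms at two different voltages.
low_voltage = dc_line_flow(power_in_w=100e6, voltage_v=100e3, resistance_ohm=10.0)
high_voltage = dc_line_flow(power_in_w=100e6, voltage_v=400e3, resistance_ohm=10.0)
```

With these assumed values, raising the voltage by a factor of four cuts the current by a factor of four and the resistive loss by a factor of sixteen, which is the reason the transmission network operates at high voltage.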
A demand source has a number of businesses or residences that it serves, plus a state-based
demand  model  that  changes  the  level  of  demand  during  the  day.  If  an  upstream  system 
element  fails,  power  is  turned  off  to  downstream  elements.  Once  the  failure  is  resolved,  the 
downstream  elements  receive  power  again. 
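
The outage behavior described above can be sketched as a walk over the node-link structure. The sketch assumes a radial network in which every element has a single upstream feed, so redundancy is not represented; the element names and the functions `downstream_of` and `powered_elements` are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def downstream_of(links: Dict[str, List[str]], failed: str) -> Set[str]:
    """Collect every element fed, directly or indirectly, through `failed`."""
    affected: Set[str] = set()
    queue = deque(links.get(failed, []))
    while queue:
        node = queue.popleft()
        if node not in affected:
            affected.add(node)
            queue.extend(links.get(node, []))
    return affected


def powered_elements(links: Dict[str, List[str]], failed: Iterable[str]) -> Set[str]:
    """Return the elements that still receive power given the current failures."""
    all_nodes = set(links) | {n for targets in links.values() for n in targets}
    dark = set(failed)
    for element in list(dark):
        dark |= downstream_of(links, element)
    return all_nodes - dark


# Hypothetical radial grid: plant -> transmission sub-station -> distribution -> demand.
grid = {
    "plant": ["tx_sub"],
    "tx_sub": ["dist_sub_a", "dist_sub_b"],
    "dist_sub_a": ["homes"],
    "dist_sub_b": ["factory"],
}

during_outage = powered_elements(grid, {"dist_sub_a"})
after_repair = powered_elements(grid, set())
```

Once the failed element is removed from the failure set, the downstream elements appear in the powered set again, mirroring the restoration behavior in the text.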

The  water  delivery  system  starts  with  water  sources  and  reservoirs.  Pipes  connect  these 
sources  and  reservoirs  to  treatment  plants.  Once  water  is  treated,  it  is  supplied  to  water  tanks 
via  pipes.  These  tanks  then  supply  water  to  industrial  and  residential  customers.  The  water 
delivery  system  is  modeled  using  the  AnyLogic  fluid  library.  This  library  offers  continuous  flow 
constructs similar to stocks and flows in system dynamics. As in the electrical grid
sub-model, these fluid elements are embedded into agent objects to provide state-based
behaviors, plus encapsulated variables.

Water  sources  are  modeled  using  fluid  sources  embedded  into  water  source  agents.  Reservoirs 
and  water  tanks  are  modeled  using  tanks  that  are  embedded  into  reservoir  and  water  tank 
agents.  Tanks  are  temporary  storage  elements  in  the  fluid  library.  Pipes  are  modeled  using 
pipeline  elements  encapsulated  in  pipe  agents.  Demand  sources  are  modeled  as  agents  with 
embedded  fluid-dispose  elements.  The  demand  sources  have  a  demand  variable,  plus  a 
variable  to  denote  either  the  number  of  residences  or  businesses  served. 

The  two  agents  that  can  fail  in  the  water  delivery  system  sub-model  are  treatment  plants  and 
pipes.  We  assume  that  water  sources,  reservoirs,  and  tanks  do  not  fail.  If  a  water  treatment 
plant  or  a  pipe  fails,  downstream  elements  will  receive  fluid  temporarily,  but  then  will 
eventually  run  out  until  water  service  is  restored  by  a  fix  to  the  failed  element. 
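
The delayed effect of an upstream failure can be illustrated with a simple hourly tank balance. This sketch does not use the AnyLogic fluid library; the function `simulate_outage`, its parameters and the volumes are assumptions chosen only to show how stored water keeps demand served for a while after a treatment plant or pipe fails.

```python
from typing import List


def simulate_outage(
    level_m3: float,
    capacity_m3: float,
    inflow_m3_per_h: float,
    demand_m3_per_h: float,
    outage_start_h: int,
    outage_end_h: int,
    horizon_h: int,
) -> List[int]:
    """Run an hourly tank balance and return the hours with unserved demand."""
    unserved_hours: List[int] = []
    for hour in range(horizon_h):
        upstream_ok = not (outage_start_h <= hour < outage_end_h)
        inflow = inflow_m3_per_h if upstream_ok else 0.0
        level_m3 = min(capacity_m3, level_m3 + inflow)  # tank cannot overfill
        if level_m3 >= demand_m3_per_h:
            level_m3 -= demand_m3_per_h                 # demand fully served
        else:
            level_m3 = 0.0                              # tank runs dry
            unserved_hours.append(hour)
    return unserved_hours


# Hypothetical tank: upstream failure from hour 2 up to (not including) hour 10.
dry_hours = simulate_outage(
    level_m3=300.0,
    capacity_m3=500.0,
    inflow_m3_per_h=120.0,
    demand_m3_per_h=100.0,
    outage_start_h=2,
    outage_end_h=10,
    horizon_h=12,
)
```

In this example the tank absorbs the first three hours of the outage, and service is lost only from hour 5 until the upstream element is fixed.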

In  addition  to  failures,  water  can  be  contaminated.  If  this  occurs,  it  is  assumed  that  a  particular 
reservoir  or  tank  is  contaminated.  The  tank  or  reservoir  is  unavailable  until  a  remediation 
process  fixes  the  contamination  problem. 

The  final  infrastructure  system  is  the  internet.  The  internet  operates  in  a  tiered  fashion  with 
major  telecommunications  companies  operating  at  the  top  level  (Tier  1)  with  data  exchange 
between  their  networks.  Tier  2  internet  service  providers  link  directly  to  this  network  and 
provide  service  to  their  customers.  A  Tier  2  network  operates  between  the  Tier  1  network  and 
the  smaller  Tier  3  ISPs.  A  Tier  3  ISP  may  be  single-sourced  or  multi-sourced  in  terms  of  its 
connections  to  the  Tier  2  network  elements.  A  large  company  or  organization  is  considered  a 
Tier  3  ISP  in  that  it  connects  to  a  Tier  2  network  element  and  provides  its  own  internal  networks 
and  services. 

Tier  1  telecommunications  providers  and  Tier  2/3  ISPs  are  modeled  as  agents.  These  operate  as 
nodes  in  the  internet  network.  They  have  server  agents  that  provide  processing  capacity  for 
transmission  of  packets  that  comprise  internet  traffic.  They  are  connected  via  arc  agents  that 
model  packet  transmission.  Servers  and  arcs  have  state  behavior,  and  they  are  either  in  an 
"available"  state  or  in  a  "failed"  state. 
The various infrastructure networks are populated via a relational database that contains the
node-link  relationships.  This  database  is  read  by  the  simulation  model  at  start-up  to  initialize 
the  infrastructure  systems. 
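
A start-up load of this kind might look like the following sketch. The schema is an assumption made for illustration: the report does not describe the actual tables, so the `links` table, its columns and the node names are hypothetical, and an in-memory SQLite database stands in for the real one.

```python
import sqlite3
from collections import defaultdict
from typing import Dict, List


def load_networks(conn: sqlite3.Connection) -> Dict[str, Dict[str, List[str]]]:
    """Read node-link rows into one adjacency map per infrastructure system."""
    networks: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    rows = conn.execute(
        "SELECT system, from_node, to_node FROM links "
        "ORDER BY system, from_node, to_node"
    )
    for system, from_node, to_node in rows:
        networks[system][from_node].append(to_node)
    return networks


# Hypothetical database contents used only for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE links (system TEXT, from_node TEXT, to_node TEXT)")
conn.executemany(
    "INSERT INTO links VALUES (?, ?, ?)",
    [
        ("power", "plant_1", "tx_sub_1"),
        ("power", "tx_sub_1", "dist_sub_1"),
        ("water", "reservoir_1", "treatment_1"),
        ("water", "treatment_1", "tank_1"),
        ("internet", "tier1_a", "tier2_a"),
    ],
)
networks = load_networks(conn)
```

Keeping the topology in a database rather than in the model code means that a different region or network layout can be simulated by changing the data alone.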


3.4.2  Infrastructure  System  Interconnections 

These  three  infrastructures  have  several  different  types  of  interconnections  and  dependencies 
as  summarized  below. 

•  The  electrical  grid  supplies  power  to  water  treatment  plants.  Such  plants  are  industrial 
customers. 

•  The  electrical  grid  supplies  power  to  servers  in  various  parts  of  the  internet  architecture. 

•  The internet supplies real-time information to the electrical system. Without such
real-time information, responses may take longer.



3.4.3  Enterprise  Actors 

The enterprise actors consist of the various actors in the infrastructure systems
that provide services. They are modeled as decision-making agents. There are four types of
enterprise  actors  in  the  current  model: 

•  Power  providers 

•  Water  providers 

•  Telecommunications  firms 

•  ISPs 

The  enterprise  actors  implement  policy  directives  for  their  segments  of  infrastructure  systems. 
This  implementation  takes  time,  and  it  is  influenced  by  the  provision  of  subsidies  and 
restrictions  on  foreign  ownership  of  firms  that  may  perform  upgrades  to  meet  directives. 


3.4.4  Policy 

Policy  actors  consist  of  those  federal  agencies  that  oversee  the  different  infrastructure  systems. 
These  are  modeled  as  complex  agents  that  issue  policy  directives.  Policies  currently  modeled 
include  the  following: 

•  Restrictions  on  foreign  ownership  of  infrastructure-related  firms  (including  contractor 
firms  that  perform  upgrades,  etc.). 
•  Redundancy requirements for certain infrastructure elements

•  Hardness  requirements  for  certain  infrastructure  elements 

•  Subsidies  for  infrastructure  hardness  or  redundancy 

At  present,  state  regulatory  agencies  are  not  modeled.  However,  they  could  be  added  in  the 
future  if  regulation  is  of  interest. 


3.4.5  Peripheral  Models 

Currently,  two  peripheral  models  are  implemented. 

•  Terrorism  -  In  this  sub-model,  segments  of  infrastructure  systems  are  targeted  for 
outages. The success of an attack depends on the hardness and redundancy of the
segment  targeted.  Terrorists  have  limited  knowledge  of  hardened  or  redundant  assets, 
and  thus  they  seek  to  target  assets  that  have  limited  hardness  or  redundancy. 

•  Natural  disaster  -  In  this  sub-model,  a  natural  disaster  is  represented  as  a  failure  in 
multiple  infrastructure  systems  within  a  geographic  area.  This  could  be  due  to  an 
earthquake  or  a  flood,  for  example. 


3.4.6  Performance 

The  model  tracks  two  primary  performance  measures. 

•  Resilience:  Resilience  is  the  ability  of  an  infrastructure  system  to  avoid  reductions  in 
service  delivery  or  recover  from  service  delivery  problems  due  to  a  disruption. 

Currently,  resilience  is  measured  as  simply  the  system  availability  relative  to  its  capacity 
over  time.  As  the  model  is  matured,  other  more  sophisticated  measures  will  be 
introduced  such  as  the  metric  in  Francis  and  Bekera  (2014). 

•  Cost:  Accrued  cost  over  time  is  tracked  to  determine  the  expense  associated  with 
different  policies.  Cost  can  be  considered  as  multi-dimensional  similar  to  service  level. 
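
One possible reading of the current resilience measure is the ratio of service delivered to service capacity, accumulated over the simulated period. The sketch below implements that reading; it is an assumption about how "availability relative to capacity over time" could be computed rather than the model's exact formula, and the function name and the sample series are hypothetical.

```python
from typing import Sequence


def availability_resilience(delivered: Sequence[float], capacity: Sequence[float]) -> float:
    """Return total delivered service divided by total capacity over the period."""
    if len(delivered) != len(capacity) or len(delivered) == 0:
        raise ValueError("series must be non-empty and of equal length")
    total_capacity = sum(capacity)
    if total_capacity <= 0:
        raise ValueError("total capacity must be positive")
    # Delivered service is capped at capacity in each period.
    served = sum(min(d, c) for d, c in zip(delivered, capacity))
    return served / total_capacity


# Hypothetical ten-period run with a disruption in periods 3 to 6.
capacity = [100] * 10
delivered = [100, 100, 100, 40, 0, 0, 60, 100, 100, 100]
resilience = availability_resilience(delivered, capacity)
```

A value of 1.0 would indicate no loss of service, and lower values reflect both the depth and the duration of the disruption.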


3.4.7  Using  Coupled  Models 

The  core  model  has  been  developed  as  an  integrated  model  of  different  infrastructure  systems. 
In  scaling  up  this  modeling  approach,  it  may  be  necessary  to  compose  different  existing  models 
of different infrastructure systems. Pederson et al. (2006) discuss examples of such model
compositions.  In  this  section,  we  briefly  discuss  model  composition  issues  from  the  perspective 
of  the  current  model,  assuming  that  different  infrastructure  systems  are  modeled  separately. 

In  this  model,  the  interactions  between  different  infrastructure  systems  are  based  on  services 
provided  from  one  infrastructure  system  to  another.  For  instance,  the  electrical  grid  model 
provides  power  service  to  the  water  delivery  system  model,  namely  to  water  treatment  plants. 
A  state  change  in  the  electrical  grid  model  may  result  in  a  power  outage.  This  power  outage 
may cause power loss to a water treatment plant. In the integrated model, the water
treatment plant is a power demand source in addition to being a water treatment plant. The
segment of the electrical grid that experiences a failure sends an outage message that is
propagated to downstream elements of the grid. The power demands receive this message
and can adjust their state to power-off. Similarly, when power is restored, a message is sent to
downstream elements notifying them of the restoration.

Assuming  two  composed  models  (electrical  grid  and  water  delivery  system),  the  state  of  the 
electrical  grid  model  is  input  into  the  water  system  model.  The  linkage  between  the  two 
models is therefore a state-linked relationship (see Section 6). Since outages are not a
continuous occurrence, the computational burden associated with managing the state-linked
relationships between this pair of composed infrastructure models is likely manageable. As the
number of infrastructure models included increases, this burden would scale with the number of
pairs with dependencies (n), the number of dependencies in each system pair (m), and the
average frequency of state-change (f) as O(nmf). Dependencies are assumed to be one-way here.

Note  that  the  current  model  has  a  discrete  linkage  relationship.  If  the  input-output 
relationships  are  based  on  the  values  of  continuously  changing  variables,  the  computational 
burden would increase due to increased f.
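
The scaling argument can be made concrete with a short calculation. The numbers are hypothetical and chosen only to contrast a discrete linkage (a few state changes per year) with a linkage driven by a variable sampled every minute; `linkage_updates` is an illustrative helper, not part of the model.

```python
def linkage_updates(n_pairs: int, m_dependencies: int, f_changes_per_year: int) -> int:
    """Return the number of cross-model updates per simulated year, n * m * f."""
    return n_pairs * m_dependencies * f_changes_per_year


# Assumed configuration: 3 dependent model pairs, 20 dependencies per pair.
discrete_updates = linkage_updates(3, 20, 12)             # about one outage per month
continuous_updates = linkage_updates(3, 20, 60 * 24 * 365)  # sampled once per minute
ratio = continuous_updates // discrete_updates
```

Under these assumptions the discrete linkage requires 720 updates per year, while the continuously sampled linkage requires more than 31 million, since n and m are unchanged and only f grows.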

The  power  demands  not  associated  with  other  infrastructure  systems  are  modeled  in 
aggregate.  That  is,  collections  of  residences  or  businesses  are  aggregated  into  one  demand 
node.  This  is  also  true  of  water  delivery  system  demands.  Thus,  for  demands  at  different 
times,  we  would  use  data  from  each  individual  infrastructure  system  to  model,  for  instance, 
demand  in  the  morning  versus  demand  at  night.  If  we  increase  the  granularity  of  the  model, 
though,  so  that  a  demand  represents  an  individual  residence  or  business,  it  is  no  longer  the 
case  that  independent  datasets  can  be  used.  For  example,  one  particular  residence  may  not 
follow  the  aggregate  demand  functions  due  to  telecommuting.  Thus,  water  usage  and 
electricity  demand  would  depend  on  one  another,  and  the  power  demand  node  in  the  electrical 
grid  model  would  be  linked  to  the  corresponding  water  demand  node  in  the  water  delivery 
system.  To  account  for  the  individualized  behavior  of  a  residence  or  business,  the  existing 
composition  would  have  transition-linked  relationships.  Since  these  linkages  are  known,  they 
are  explicit  transition-linkages. 

To  remediate  them,  constraints  may  be  put  in  place.  For  example,  a  variable  set  can  be  added 
to  one  of  the  demand  nodes  indicating  the  type  of  at-home  behavior  of  that  node.  This  variable 
set  then  influences  state  changes  in  the  demand  for  its  node  directly,  and  it  can  be  used  in  a 
state-linked  transition  to  influence  the  demand  in  the  corresponding  node  in  the  other 
infrastructure  model. 
3.5 Evaluation


The  model  and  enterprise  modeling  approach  were  presented  to  a  group  of  subject  matter 
experts  involved  in  the  INCOSE  Critical  Infrastructure  Protection  and  Recovery  (CIPR)  Working 
Group.  An  initial  overview  was  presented  to  the  CIPR-WG  at  the  2017  INCOSE  International 
Workshop.  A  second  presentation  was  made  to  the  CIPR-WG's  monthly  meeting  on  February 
16. 

Comments  and  discussion  from  this  second  session  were  captured  and  are  organized  along  the 
following  lines: 

1.  Validity  —  the  extent  to  which  the  simulation  is  technically  correct  relative  to  the 
purposes  for  which  it  was  developed. 

2.  Acceptability  —  the  extent  to  which  the  simulation  addresses  problems  in  ways  that  are 
compatible  with  current  preferred  ways  of  decision-making  and/or  potentially  useful 
new  ways  of  multi-stakeholder  decision-making. 

3.  Viability  —  the  extent  to  which  use  of  the  simulation  for  the  purposes  intended  would 
be  worth  the  time  and  effort  required. 


Validity 

•  What  data  sources  are  being  used?  Most  data  for  the  various  infrastructure  sectors  is 
sensitive  or  proprietary. 

o  The  data  is  synthetic  in  the  model  currently  due  to  this  issue. 

o  It  would  be  desirable  to  have  synthetic  datasets  that  were  validated  as  being 
"representative"  of  actual  datasets  for  purposes  of  public  analysis. 

•  Can  this  approach  be  used  to  model  micro-grids?  There  are  some  opportunities  to 
model  these  types  of  systems,  which  would  be  on  a  smaller  scale  and  may  provide  some 
validation. 

o  This  approach  should  work  for  micro-grids. 

•  It  would  be  helpful  to  see  a  detailed  walk-through  of  the  model,  the  various  parameters, 
and  the  interactions. 

•  This  could  connect  to  work  being  done  in  model-based  systems  engineering  (MBSE)  and 
in  patterns. 

•  Resilience  has  many  different  definitions.  The  usage  of  absorptive,  adaptive  and 
restorative  is  interesting. 

•  It  would  be  of  interest  to  incorporate  major  disruptive  events  such  as  solar  flares  or 
electromagnetic  pulse  bombs  into  the  types  of  phenomena  modeled. 
Acceptability


•  There  was  general  agreement  that  this  type  of  modeling  approach  would  be  useful  for 
addressing some of the issues that the CIPR-WG community has been tasked with.

•  It  would  be  useful  to  extend  the  model  to  additional  infrastructure  sectors. 


•  We  need  to  build  up  a  modeling  community  for  critical  infrastructure. 

Viability 

•  Several individuals expressed interest in this as a focus area within CIPR-WG. This
could  be  foundational  to  creating  a  community  that  could  sustain  this  type  of  modeling. 

•  What  level  of  effort  is  involved  in  creating  a  model  with  many  different  interacting 
infrastructures? 


Overall,  the  discussion  and  feedback  focused  less  on  using  such  a  model  to  engage  different 
infrastructure  communities  from  different  perspectives  to  provide  a  platform  for 
communication  and  perspective-sharing  and  more  on  the  use  of  such  models  for  analysis  and 
insight.  This  is  in  contrast  to  the  subject  matter  expert  review  of  the  counterfeit  parts  case 
study.  Most  likely,  this  is  a  function  of  the  INCOSE  working  group  in  question,  in  that  it  focuses 
on cross-domain work in critical infrastructure and thus does not require as much in terms of
communication  tools  for  different  perspectives. 

4  Summary  of  Industry-Government  Forum  on  Model  Centric  Engineering 


Real  world  experience  is  a  critical  component  of  developing  methods  and  approaches  that  can 
be  transitioned  to  practice.  With  regard  to  this  effort,  relevant  real  world  experience  would 
need  to  involve  both  multi-level/multi-scale  modeling  as  well  as  detection  of  unintended 
consequences  that  result  from  the  interactions  of  these  multiple  views  of  the  system.  One  area 
where  practitioners  are  addressing  these  challenges  is  Model  Centric  Engineering  (MCE).  The 
goal  of  MCE  is  to  use  computer  modeling  and  simulation  to  capture  and  manage  every  aspect 
of  the  engineering  process  from  requirements  development  to  design  to  sustainment.  The  goal 
is  to  use  simulation  to  detect  potential  issues  much  earlier  in  the  system  lifecycle  to  avoid  costly 
fixes  and  workarounds  downstream.  Necessarily  this  means  computationally  representing  the 
system  from  multiple  perspectives  and  tracing  the  consequences  of  decisions  in  one 
perspective  to  consequences  in  the  others.  From  a  technical  standpoint,  the  problem  is  very 
similar  to  that  of  detecting  unintended  policy  consequences  in  an  enterprise  system.  The  chief 
distinction is that behavioral and social factors tend to play a larger role in enterprise systems
than in engineered systems. As we will see in Section 5.2, dealing with behavioral and social
factors in a
multi-level  model  is  a  substantial  challenge. 
To gain an understanding of the state of practice in MCE, an industry-government workshop was
held  in  Washington,  DC  on  May  26,  2016.  Participants  in  the  workshop  included: 

•  15  faculty  members  from  across  the  collaborating  SERC  universities 

•  35 technical leaders from industry

•  25  technical  leaders  from  the  government 

The  detailed  results  of  this  workshop  were  provided  in  a  separate  report  to  the  Government. 
Consequently,  the  results  will  only  be  summarized  here. 

While there was certainly discussion of the technical issues associated with developing and
integrating the computational models to support MCE, much of the discussion revolved around
behavioral, cultural, social, and organizational impediments to implementation. In short,
although the participants did not state it this way, they felt that implementing MCE is largely
an enterprise problem: existing business practices and cultural norms make it nearly impossible
to implement MCE even if the technical challenges are overcome. So while the technical
challenges were recognized, at least from the researchers' perspective they were largely
neglected in the MCE forum. Rather, there seemed to be an acknowledgement that how one
integrates and validates a multi-level simulation to support MCE is an open research question.
No systematic approach to accomplishing this currently exists; existing efforts have largely
been implemented on a case-by-case basis.

So  while  the  workshop  produced  limited  technical  insights  on  how  to  accomplish  the  detection 
of  unintended  consequences  computationally,  it  certainly  validated  the  research  question.  In 
fact,  it  revealed  that  the  implementation  of  MCE  itself  is  an  enterprise  problem  that  requires 
analysis. However, without techniques from practitioners to consider, understanding the work
of those using multi-level or multi-scale models in the physical and social sciences became
critical to the research effort. The results of that investigation are presented in the
following section.

5  Literature  Review 


In order to develop a systematic approach to detecting unintended policy consequences in
an enterprise system using the proposed core-peripheral approach, several issues must be
addressed. First, the peripheral models will often be described using an ontology that is
nominally incompatible with the core model because it uses a different abstraction or a
different scale. Thus, one is concerned with how to handle multi-scale ontologies. Second,
enterprise systems contain substantial behavioral and social components. It is likely that one
or more of the peripheral models will draw from the social sciences. Historically, models in the
social sciences exhibit both greater variance and more instability than those from the physical
sciences. Understanding how these issues are handled in the social sciences is critical. Third,
building a model from multiple abstractions creates validation issues, because the model that is
built is different from the component theories that have been validated. That validation does
not automatically pass to the new composite model. Thus, there is the question of how one knows
whether or not the predicted consequences of a composite model are valid. An examination of
how this is handled in both the physical sciences and social sciences is necessary to develop an
approach to validating enterprise models.

It  should  be  noted  that  the  social  sciences  and  the  physical  sciences  follow  drastically  different 
approaches  to  modeling  systems  and  predicting  outcomes.  The  social  sciences  tend  to  be  more 
data  driven  to  find  unintended  consequences,  while  the  physical  sciences  tend  to  be  more 
theory  driven.  Consequently,  the  literature  review  is  broken  into  two  components.  First,  we 
consider  how  multi-scale  modeling  is  handled  in  the  physical  sciences  including  physics, 
chemistry,  biology,  and  engineering.  Second,  we  consider  how  unintended  consequences  are 
detected  in  the  social  sciences.  In  the  social  sciences,  the  challenge  is  that  there  are  often  so 
many  different  abstractions  that  could  be  applicable  that  the  notion  of  organizing  them  by  scale 
becomes  meaningless. 

It  should  be  noted  that  while  the  physical  sciences  and  social  sciences  are  quite  different  in 
terms  of  terminology  and  approach,  at  an  abstract  level,  they  are  quite  similar  in  intent. 
Consequently,  the  increasing  prevalence  of  multi-modeling  may  end  up  merging  the  two 
approaches  in  the  long  run.  However,  in  the  short  run,  the  challenge  of  overcoming  these 
differences  remains. 


5.1  Multi-scale  Modeling  in  the  Physical  Sciences 

Generally  speaking,  models  based  on  well-established  theories  exist  in  different  domains.  As 
models,  they  represent  the  observed  phenomena  in  an  incomplete  way  -  they  are  not  the 
phenomena.  The  main  assumption  behind  multi-scale  modeling  is  that  by  putting  these 
individual  representations  together,  we  are  working  towards  a  more  complete  and  accurate 
representation of the phenomena of interest. This means being able to understand the
transitions between existing theories and between models.

On a more practical level, the use of multi-scale modeling seems appropriate whenever
computational bottlenecks associated with the growing size of the problem arise (Brandt
2002), for instance, if the computational cost increases significantly with the number of
variables, or when the number of variables is so large that even linear-scaling algorithms would
be very expensive. The low-level resolution of most variables also adds to these bottlenecks.

It is important to review and synthesize the literature on multi-scale modeling because it
faces similar challenges, albeit in different domains. The focus is on natural sciences and
engineering work that provides relevant information on the topic, and on modeling applications
that face major roadblocks due to multi-scale needs. The review is organized as follows: after a
brief description of different applications of multi-scale modeling, important practical
questions are covered. Specifically, we aim to understand how researchers select and couple
stand-alone models, prevent model overlap, and validate the resulting multi-scale model.


5.1.1  Multiscale  Modeling  Applications 

In Physics

A classic example of a scientific roadblock due to multi-scale needs is in physics. Quantum
theory works remarkably well in all practical applications (Zurek 2002). States of quantum
systems evolve according to the Schrödinger equation. Given the initial state, the universe
evolves to a state of many alternatives - superposition1 - never seen to coexist in the world.
Everyday objects are expected to obey quantum mechanics, with their behavior described by
the Schrödinger equation. After all, objects are collections of atoms. Yet they seem to obey
Newton's laws instead (Bhattacharya et al 2004), as if classical dynamics emerges in a quantum
world. This is the quantum-to-classical transition problem, and it represents an example of a
multi-scale modeling problem in physics.

Saying quantum-to-classical transition almost implies the existence of a distinct border that
separates the applicability of both theories - an important property of a non-overlapping
multi-scale model. However, there is no evidence of a border at which the Schrödinger equation
would fail (Zurek 2002). There is, however, a key aspect to the problem: macroscopic systems
are never isolated from their environment. Therefore, their behavior cannot follow the
Schrödinger equation, as it only applies to closed systems. Macroscopic systems experience
what is known as decoherence, or a loss of quantum coherence into the environment. The
environment induces a super-selection rule that prevents specific superpositions from being
observed. Consequently, only states that resist this process can eventually become classical.

Decoherence  as  a  concept  sets  the  desired  border  between  the  quantum  and  classical  theories. 
It  also  justifies  the  emergence  of  classical  behavior  from  a  quantum  model.  But  how  does 
classicality  emerge  from  a  quantum  model?  Quantum  measurements  record  the  potential 
states  of  a  quantum  system.  A  density  matrix2  describes  the  probability  distribution  over  the 
alternative  states.  A  reduction  of  the  state  vector  takes  the  pure-state  density  matrix  and 
cancels  the  off-diagonal  terms  that  represent  purely  quantum  correlations3.  The  reduced 
density matrix with only classical correlations emerges. The coefficients of the matrix can now
be interpreted as classical probabilities (Zurek 2002). It is important to note that reduction of
the state vector reduces the information available to the observer. It may also exclude outcomes
that are to become classical and the initial conditions required to predict future states.
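The reduction step described above can be illustrated with a minimal numerical sketch. It
assumes a hypothetical two-state system in an equal superposition and uses NumPy; the variable
names are ours and not taken from Zurek (2002). Physical decoherence suppresses the off-diagonal
terms gradually through interaction with the environment, whereas this sketch simply sets them
to zero in one step to show what is kept and what is lost.

```python
import numpy as np

# Hypothetical two-state system in an equal superposition (|0> + |1>) / sqrt(2).
psi = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2.0)

# Pure-state density matrix rho = |psi><psi|.
rho_pure = np.outer(psi, psi.conj())

# Idealized reduction: keep the diagonal and cancel the off-diagonal terms.
rho_reduced = np.diag(np.diag(rho_pure))

# The remaining diagonal entries can be read as classical probabilities.
probabilities = np.diag(rho_reduced).real

# Purity Tr(rho^2) equals 1 for a pure state and drops below 1 for a mixture,
# which is one way to see that information has been lost in the reduction.
purity_pure = np.trace(rho_pure @ rho_pure).real
purity_reduced = np.trace(rho_reduced @ rho_reduced).real
```

In this toy case the two outcomes are equally likely and the purity falls from 1 to 0.5,
consistent with the remark that the reduction reduces the information available to the observer.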

It is believed that decoherence and the quantum-to-classical transition result from the
interaction of a system with its environment. These considerations are based on a specific
model - a particle in a heat bath of harmonic oscillators - which is a reasonable approximate
model for more complicated systems (Zurek 2002). Physicists continue to work towards a single
multi-scale model (and theory) that explains the emergence of classical mechanics in a quantum
world for a wider range of phenomena. There seems to be little room for questions about
stand-alone model selection: both theories have existed for many years and have been the
subject of experimental scrutiny. New data will lead to new assumptions, and experimental
results might eventually confirm or contradict these assumptions. In any case, any conclusion
on a new multi-scale theory will be based on observation. It seems that until then, we must use
one theory or the other to explain the same physical phenomena at different scales.

1 Superposition is the ability of an atom to be in more than one quantum state at the same time.

2 A density matrix is the analogue of the phase-space probability measure (position and
momentum) in classical mechanics.

3 Purely quantum correlations are correlations impossible to achieve when modeling a system
with classical mechanics.

In  Biology 

Cancer  is  a  group  of  diseases  characterized  by  uncontrolled  cell  growth  and  tissue  invasion 
(Deisboeck et al 2011). To model the different carcinogenesis phases from initiation to
metastasis, a multitude of multi-scale processes must be considered. The use of multi-scale
modeling appears as a natural approach. The assumption is that multi-scale models have the
potential to refine existing hypotheses, focus experiments, and improve predictions.
Improved predictions in turn help in the development of new cancer drugs and treatments.
Some  studies  have  been  successful  in  establishing  a  mechanistic  link  between  some  of  the 
processes  at  different  biological  levels  that  contribute  to  the  different  carcinogenesis  phases 
(Deisboeck  et  al  2011).  We  highlight  two  of  these  studies  next. 

The first study considers how abnormal cell signaling at the molecular level triggers oncogenic
transformations. Briefly, the Epidermal Growth Factor4 (EGF) binds to the Epidermal Growth
Factor Receptor5 (EGFR) and causes cells to grow and differentiate. The EGFR is found at
exceptionally high levels on the surface of many types of cancer cells. In the presence of the
EGF, these cells may divide disproportionately. Abnormal activation of signaling pathways can
result in cancer initiation and progression. To model how modified signaling caused by mutations
in the EGFR triggers oncogenic transformations, researchers have focused on the simulation of
protein structure and protein-ligand interactions, protein intramolecular large-scale motion and
protein-membrane interactions, and signal transduction (Liu et al 2007). The spatial and
temporal scales of these simulations range from approximately $10^{-10}$ meters and $10^{-15}$
seconds to roughly $10^{-6}$ meters and $10^{0}$ seconds. The choices for the stand-alone models
are molecular dynamics, free energy docking, generalized Langevin dynamics, kinetic Monte Carlo
and transient system dynamics. We assume that these are standard model choices for the
simulations in question, as no justification of model choice was provided. Likewise, no detailed
explanation was offered in terms of a general model coupling strategy - just that the individual
models were coupled via their input and output ports. We presume that the coupling followed a
somewhat trial-and-error process. We base this assumption on the fact that the study emphasizes
the consistency between the multi-scale simulation results and the experimental observations
(Liu et al 2007).

The second study considers human brain cancer specifically. Human brain cancer cells
proliferate or migrate but do not exhibit both phenotypes simultaneously. Experimental evidence
shows that a molecular switch operates between cellular proliferation and migration in highly
malignant brain tumor cells. The exact molecular mechanism that triggers the phenotypic
switch has not yet been determined. A multi-scale attempt to establish such a mechanism
incorporates a gene-protein decision network into a multi-scale, agent-based model to simulate
the division between cell migration and proliferation, and tumor growth across different orders
of magnitude (Athale et al 2005). Cancer cells are modeled as autonomous agents consisting of
sub-cellular sites: nucleus, cytoplasm and membrane. These sites are further decomposed into
sub-sites that contain all the molecules in the EGFR signaling network. Mass balance reactions
and reactions determined by the interaction network regulate the flow of molecules from one
sub-site to another. Ordinary differential equations are used to represent the molecular
concentration over time. To determine whether a cell should migrate or not, a phenotypic
decision threshold is established and the migratory potential for each cell is determined. The
simulation results show that cell proliferation or migration impacts cancer expansion as a
whole. They also highlight experimentally testable hypotheses at the sub-cellular level. It is
important to note that the authors of the study suggest that it is the comparison between the
multi-scale simulation results and experimental data that paves the road for biological and
clinical discoveries.

4 Epidermal growth factor is a protein made by cells and some types of tumors.

5 Epidermal growth factor receptor is a protein found on the surface of some cells.
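To make the agent-level decision rule more concrete, the following is a minimal sketch and not
the model of Athale et al (2005). It assumes a single hypothetical signaling molecule whose
concentration follows a one-line ordinary differential equation, and it treats the peak rate of
change of that concentration as a stand-in for the migratory potential. The function names, the
rate constants, the threshold value, and the rule that a cell proliferates when it does not
migrate are all illustrative assumptions.

```python
def simulate_cell(production, decay=0.1, c0=0.0, dt=0.01, steps=1000):
    """Forward-Euler integration of dc/dt = production - decay * c.

    Returns the final concentration and the largest rate of change seen,
    which this sketch uses as a hypothetical migratory potential.
    """
    c = c0
    peak_rate = 0.0
    for _ in range(steps):
        rate = production - decay * c
        peak_rate = max(peak_rate, rate)
        c += dt * rate
    return c, peak_rate


def phenotype(migratory_potential, threshold=0.5):
    """Compare the potential against a (hypothetical) decision threshold."""
    return "migrate" if migratory_potential > threshold else "proliferate"


# Two cells that differ only in how strongly the molecule is produced.
conc_fast, rate_fast = simulate_cell(production=1.0)
conc_slow, rate_slow = simulate_cell(production=0.2)
decision_fast = phenotype(rate_fast)
decision_slow = phenotype(rate_slow)
```

In an agent-based setting each cell would carry its own copy of such a state and apply the
threshold at every step; the published model resolves many molecules and sub-cellular sites
rather than the single variable assumed here.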

To conclude this subsection, we highlight a third multi-scale study on the development and
prevention of in-stent restenosis. Stenosis is an abnormal narrowing of a blood vessel,
especially a coronary artery. Current interventions include stent-assisted balloon angioplasty,
where a small tubular mesh - a stent - is deployed at the site of the stenosis and acts as a
mechanical structure that compresses the plaque and reduces the chances of vessel collapse
(Tahir et al 2011). In-stent restenosis is the recurrence of stenosis after the surgical
intervention (i.e. stent deployment). A sequence of multi-scale processes takes place in
response to arterial wall damage: local coagulation (thrombosis) that progresses to an
inflammatory stage, granulation tissue deployment, smooth muscle cell proliferation,
extracellular matrix deposition and remodeling of the neointima (Evans et al 2008). To
understand how these processes interact across scales, researchers have used a scale separation
map (i.e. a graphical representation of the processes along different scales). The availability
of quantitative data on the spatial and temporal characteristics of the processes is a
requirement to build the map.

The in-stent restenosis problem has been modeled in the following multi-scale fashion: a
lattice Boltzmann bulk flow solver for the blood flow is coupled with an agent-based model for
the smooth muscle cell dynamics and a finite difference model for the drug diffusion from the
stent and within the cellular tissue (Caiazzo et al 2011). A kernel simulates the deployment of
the stent into the cellular tissue and generates the initial conditions. The coupling between the
individual models is accomplished via conduits and mappers (Chopard et al 2014). A conduit is a
one-way, point-to-point communication channel that, in this case, converts the positions and
radii of cell agents (smooth muscle cell model) into a computational mesh6 for the flow solver,
which is decomposed into fluid and solid nodes (Caiazzo et al 2011). Another conduit converts
the same positions and radii into a computational mesh for the drug diffusion solver. Mappers
are multi-port data transformation agents7 that, in this particular application, take the output
of the bulk flow solver, the output of the drug diffusion model, and the present cell
configuration to compute the shear stress on each cell. As with the development of the scale
separation map, quantitative data also informs the multi-scale coupling required.

6 A mesh is a discretization of a geometric domain into small and simple shapes, such as
triangles in 2D and tetrahedra in 3D.

7 Multi-port data transformation agents combine inputs from multiple conduits and produce
multiple outputs.
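The roles of conduits and mappers can be sketched in a few lines. This is a minimal
illustration under our own assumptions, not the coupling framework used by Caiazzo et al (2011)
or Chopard et al (2014): the class and function names are hypothetical, the geometry is reduced
to one dimension, and the solver outputs are placeholder lists rather than results of a lattice
Boltzmann or finite difference computation.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Cell = Tuple[float, float]  # (position, radius) of a cell agent, 1D for brevity


@dataclass
class Conduit:
    """One-way, point-to-point link: converts a sender's output for one receiver."""
    convert: Callable

    def send(self, data):
        return self.convert(data)


def cells_to_mesh(cells: List[Cell], n_nodes: int = 11, length: float = 1.0) -> List[str]:
    """Label each node of a uniform 1D grid 'solid' if a cell covers it, else 'fluid'."""
    dx = length / (n_nodes - 1)
    return [
        "solid" if any(abs(i * dx - pos) <= radius for pos, radius in cells) else "fluid"
        for i in range(n_nodes)
    ]


def nearest_node(x: float, n_nodes: int, length: float) -> int:
    """Index of the grid node closest to position x."""
    dx = length / (n_nodes - 1)
    return min(n_nodes - 1, max(0, round(x / dx)))


def cell_state_mapper(flow_out, drug_out, cells, length=1.0):
    """Multi-port mapper: several inputs in, several per-cell outputs out."""
    n = len(flow_out)
    shear = [flow_out[nearest_node(pos, n, length)] for pos, _ in cells]
    drug = [drug_out[nearest_node(pos, n, length)] for pos, _ in cells]
    return shear, drug


cells = [(0.0, 0.05), (0.5, 0.05)]
to_flow_solver = Conduit(convert=cells_to_mesh)
mesh = to_flow_solver.send(cells)            # cell agents -> fluid/solid nodes

flow_out = [float(i) for i in range(11)]     # placeholder per-node flow output
drug_out = [0.1 * i for i in range(11)]      # placeholder per-node drug output
shear_per_cell, drug_per_cell = cell_state_mapper(flow_out, drug_out, cells)
```

The design point the sketch tries to convey is the separation of concerns: a conduit performs a
single conversion between exactly two models, while a mapper is the place where outputs of
several models are combined.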


In  Chemistry 

Two  examples  from  the  literature  are  included  here:  a  homogeneous-heterogeneous  chemical 
reactor  model  and  a  hybrid  multi-zonal/computational  fluid  dynamics  model  of  chemical 
process  equipment  (Vlachos  1997;  Bezzo  &  Macchietto  2004;  Yang  &  Marquardt  2009). 

Homogeneous and heterogeneous processes are frequent in many areas such as catalysis,
electrochemistry and corrosion (Vlachos 1997). Some of these processes include the transport
of reactants and products in a fluid boundary layer, homogeneous reactions in the fluid phase,
and heterogeneous reactions on a surface. Consider, for example, a model of a
homogeneous-heterogeneous chemical reactor (Vlachos 1997). The reactor model includes the
homogeneous bulk fluid phase and the heterogeneous solid catalyst surface - a partially
overlapping decomposition. Adsorption, reaction, and desorption of molecules occur at the solid
surface. The reactants of the surface reaction diffuse from the fluid phase towards the solid
surface. In the opposite direction, the products of the surface reaction diffuse from the solid
surface into the bulk fluid phase (Yang & Marquardt 2009). In order to generate the necessary
kinetics information, the surface can be decomposed into a molecular lattice.

Hybrid  multi-zonal/computational  fluid  dynamics  models  are  appropriate  to  model  chemical 
process  equipment  (Yang  &  Marquardt  2009).  The  goal  is  to  decompose  the  system  into 
different  scales  in  order  to  simplify  computations.  Consider  a  piece  of  chemical  equipment.  As  a 
first  step,  the  equipment  is  decomposed  into  a  number  of  zones  each  of  which  is  characterized 
by  a  number  of  variables  such  as  temperature,  pressure,  etc.  To  show  the  heterogeneity  of  the 
overall  space,  these  variables  have  distinct  values  in  different  zones.  The  zone-variable 
assignment  precedes  a  further  decomposition  of  the  zones  into  cells.  The  cells  also  have  their 
own  set  of  variables;  however,  these  variables  are  constrained  by  the  ones  at  the  zone  level. 
Precision is the main driver behind the decomposition process: whenever the phenomena to be
modeled require accurate representation, the decomposition continues down to the cell level.
For phenomena that require a less precise representation, the reduction to zones suffices.

The  multi-zonal  model  maps  the  overall  space  and  it  is  independent  of  the  model  for  each  zone. 
Rather,  it  is  topologically  defined  with  respect  to  the  interfaces  that  connect  the  zones  (Bezzo  & 
Macchietto  2004).  The  characterization  of  the  flux  of  material  between  zones  and  the 
properties  that  result  from  fluid  mechanical  mixing  processes  are  used  to  determine  the 
coupling  scheme.  The  properties  can  either  be  determined  from  the  computational  fluid 
dynamics  models  or,  if  required  by  these  models,  determined  by  the  multi-zonal  model. 
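A minimal sketch of the zone-level part of such a hybrid model is given below. It assumes two
well-mixed zones joined by one interface and a fixed volumetric exchange rate; in a hybrid
scheme that rate would be the kind of quantity supplied by the fluid dynamics model, but here it
is simply a hypothetical constant. The `Zone` class, the `exchange_step` function and all
numerical values are illustrative and do not come from Bezzo & Macchietto (2004).

```python
from dataclasses import dataclass


@dataclass
class Zone:
    volume: float          # m^3
    concentration: float   # mol/m^3, assumed uniform within the zone


def exchange_step(a: Zone, b: Zone, flow: float, dt: float) -> None:
    """One explicit Euler step for two well-mixed zones that exchange fluid
    across their shared interface at volumetric rate `flow` (m^3/s)."""
    moles_a_to_b = flow * (a.concentration - b.concentration) * dt
    a.concentration -= moles_a_to_b / a.volume
    b.concentration += moles_a_to_b / b.volume


zone_a = Zone(volume=1.0, concentration=2.0)
zone_b = Zone(volume=3.0, concentration=0.0)
initial_moles = zone_a.volume * zone_a.concentration + zone_b.volume * zone_b.concentration

for _ in range(2000):
    exchange_step(zone_a, zone_b, flow=0.5, dt=0.1)

final_moles = zone_a.volume * zone_a.concentration + zone_b.volume * zone_b.concentration
```

Because the update only moves material across the interface, the total amount is conserved and
both zones relax to the same concentration. The zone model itself never needs to know how each
zone is resolved internally, which is the sense in which the topology is independent of the
per-zone models.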

In  Material  Sciences 


In material sciences, it is common to distinguish between different length scales: the atomic
scale, the microscopic scale, the mesoscopic scale, and the macroscopic scale. The main players
at each of the scales are electrons, atoms, lattice defects (such as dislocations and grain
boundaries), and continuum fields (such as density, velocity, temperature, displacement and
stress fields), respectively (Lu & Kaxiras 2004). Well-established and efficient computational
approaches that model phenomena at each scale have been developed over the years.
Individually, none of these approaches suffices to describe multi-scale phenomena. For example,
a full atomistic description of material defects alone does not describe the observed
macroscopic behavior; higher-scale defect interactions do (Curtin & Miller 2003). In fact,
material sciences applications are "hitting the bounds of single-scale models in both time and
length scales" (Germann & Randles 2012). The challenge in material sciences simulation thus
becomes how to combine the available stand-alone models to tackle unanswered questions in the
field.

Report No. SERC-2017-TR-106

28

Date April 30, 2017

Conceptually, two multi-scale approaches in material sciences can be envisioned: a sequential
approach and a concurrent approach, which may also be combined (Lu & Kaxiras 2004). The
sequential approach does not couple individual models directly but passes critical information,
such as material properties, from atomistic models to continuum ones. The concurrent
approach couples atomistic and continuum models explicitly, which allows for an atomistic
description of critical regions and a coarser description of the more uniform regions away from
the critical ones (Miller & Tadmor 2009). The goal of either approach is to predict the
performance and behavior of materials across space and time scales and to make the best
compromise between accuracy, efficiency, and realistic description. To illustrate both
strategies, consider the Peierls-Nabarro model of dislocations and the macroscopic-atomistic
ab initio dynamics approach (Lu & Kaxiras 2004).
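The sequential idea, in which a fine-scale model is run first and only a derived property is
handed to the coarse-scale model, can be sketched as follows. The example is ours and is not
taken from Lu & Kaxiras (2004): it assumes a hypothetical Lennard-Jones-like pair energy in
reduced units as the "atomistic" model and a one-dimensional chain of identical bonds treated as
an elastic bar as the "continuum" model.

```python
def pair_energy(r, eps=1.0, r0=1.0):
    """Hypothetical pair energy with its minimum at the bond length r0."""
    return eps * ((r0 / r) ** 12 - 2.0 * (r0 / r) ** 6)


def bond_stiffness(r0=1.0, h=1e-4):
    """Fine-scale step: stiffness k = d2E/dr2 at equilibrium, by central differences."""
    e_plus = pair_energy(r0 + h, r0=r0)
    e_mid = pair_energy(r0, r0=r0)
    e_minus = pair_energy(r0 - h, r0=r0)
    return (e_plus - 2.0 * e_mid + e_minus) / h ** 2


def bar_elongation(force, length, stiffness, spacing=1.0):
    """Coarse-scale step: length/spacing identical bonds in series stretch by F*n/k."""
    n_bonds = length / spacing
    return force * n_bonds / stiffness


# Sequential hand-off: the atomistic model runs once, and only k is passed on.
k = bond_stiffness()
elongation = bar_elongation(force=1.0, length=100.0, stiffness=k)
```

For this particular energy function the stiffness can be checked by hand (72 in reduced units),
which makes the hand-off easy to verify. A concurrent scheme would instead keep both models
running and exchange information at every step in the region where they meet.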

Dislocations are an important concept in the understanding of the mechanical properties of
crystalline solids (Lu & Kaxiras 2004). Continuum elasticity theory explains the long-range
elastic strain of a dislocation beyond a few lattice spacings. In close proximity to the
dislocation core, such an explanation falls apart. The Peierls-Nabarro model of dislocations
addresses this problem by incorporating a discrete dislocation core structure into a continuum
framework. To illustrate it, consider a solid with an edge dislocation in the middle, i.e. two
elastic half-spaces linked by atomic forces across a common interface. The goal of the
Peierls-Nabarro model is to compute the slip distribution8 on the interface that minimizes the
total energy (Lu & Kaxiras 2004). To do so, the elastic energy that is stored in both
half-spaces due to the dislocation and the nonlinear potential energy that results from
atomistic interactions across the interface must be determined, as both contribute to the total
energy. As previously mentioned, elasticity theory determines the elastic energy in both
half-spaces fairly well. The limitation arises at the interface. Classical interatomic
potentials (or, alternatively, ab initio calculations9) are used to determine the potential
energy due to the atomistic interactions. This atomistic information feeds into the
coarse-grained continuum framework, making for a sequential multi-scale approach to the
dislocation problem.

Unlike dislocations, the study of fracture dynamics is better approached with a concurrent
strategy (Lu & Kaxiras 2004). The reason is that fracture phenomena result from dynamic
interactions between multi-scale processes, which contribute to the total fracture energy.
Consider the macroscopic-atomistic ab initio dynamics approach applied to the dynamical
fracture process in Si (a brittle material). At the crack tip region, atomic bonds break and
form; around the crack tip region, atomic bonds do not break but there are significant strain
gradients; at the far-field region, atomic displacements and strain gradients are small. At each
of the regions, well-established and tested methods apply. Specifically, the
macroscopic-atomistic ab initio dynamics approach links a quantum-mechanical tight-binding
approximation method to a classical molecular dynamics method to a continuum finite element
method, respectively. In between the regions, coupling or handshaking algorithms tackle the
transitions. For the finite element - molecular dynamics transition, the algorithm scales down
the finite element mesh size to atomic dimensions (or expands it in the opposite direction). For
the molecular dynamics - tight-binding transition, there are fictitious atoms situated directly
on top of the atoms in the molecular dynamics region. On one side of the interface, the bonds to
an atom are deduced from the tight-binding Hamiltonian; on the other side, the bonds are
derived from the interatomic potential of the molecular dynamics simulation (Lu & Kaxiras
2004).

8 Slip distribution (or relative displacement) is a measure of the misfit across the interface
and it characterizes the dislocation.

9 Ab initio calculations are calculations from basic and well-established laws of nature.


5.1.2  Multi-scale  Mathematics

The previous section illustrates some of the scientific roadblocks due to multi-scale needs.
These are domain- or application-driven needs and can be summed up as the integration of
heterogeneous models and data that describe multi-scale phenomena of interest. Great
complexity results from the many variables and interactions between heterogeneous models
and data. Some studies have demonstrated that scale-born complexities can be overcome, or
reduced, by multi-scale10 algorithms (Dolbow et al 2004). This section provides an overview of
some of these algorithms. Fundamental to most algorithms are mathematical subjects such as
error estimation methods (to estimate error propagation across models and scales, which can
result from model mismatch and the coupling process, for example), uncertainty quantification
methods (to characterize and quantify sources of uncertainty and, together with error
estimates, to identify the proper scale resolutions in adaptive methods and obtain information
on the model solutions and their reliability), inverse and optimization methods (to identify
model parameters and control mechanisms), and dimensional reduction methods (to simplify models
with high-dimensional state or input parameter spaces to essential dimensions and modes, i.e. a
reduction in the number of degrees of freedom) (Dolbow et al 2004). These topics will not be
addressed in detail here.

Multiresolution  methods 


The  goal  of  multiresolution  methods  is  to  decompose  objects  into  terms  resolving  different 
scales  or  resolutions  for  the  purpose  of  analysis,  approximation,  compression  or  processing 
(Kunoth  2015).  The  objects  can  be  given  explicitly  in  the  form  of,  for  example,  time  series  or 
image  data,  or  implicitly  as  solutions  of  partial  differential  equations.  Consider,  for  example,  a 
univariate  function  /  that  exists  on  a  finite  interval  [0,  T]  c  R  and  that  describes  a  given 


10  Multi-scale  or  multi-resolution  or  multi-level  or  multi-grid  algorithms. 
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object.  The  goal  of  a  multiresolution  method  is  to  find  a  decomposition  f(t)  for  j  scales  or 
resolutions: 


fit)  =  G  [0 ,T],j  G  N 

7=0 

An  example  of  a  classical  decomposition  is  the  Fourier  analysis  (Kunoth  2015).  Fourier  analysis 
converts  signals  from  their  original  domain,  oftentimes  time  or  space,  to  a  representation  in  the 
frequency  domain  (and  vice  versa).  In  this  case,  the  multiscale  components  are  of  the  form 

$$g_j(t) = a_j \exp(i \omega_j t)$$

where $\omega_j$ are the frequencies and $a_j$ are the constant amplitudes to be determined from $f$ using, for instance, the Fourier transform¹¹. Note that, in this particular example, the multiscale components $g_j$ all have the same specific form.
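
As a minimal sketch of this decomposition for sampled data, the snippet below uses NumPy's discrete Fourier transform to obtain amplitudes $a_j$ and frequencies $\omega_j$ and then sums the components $g_j(t) = a_j \exp(i \omega_j t)$ to recover the signal. The test signal and the helper names `fourier_components` and `reconstruct` are illustrative assumptions, not part of the cited work.

```python
import numpy as np

def fourier_components(samples):
    """Return complex amplitudes a_j and angular frequencies w_j (radians per sample)."""
    n = len(samples)
    a = np.fft.fft(samples) / n            # amplitudes a_j
    w = 2.0 * np.pi * np.fft.fftfreq(n)    # frequencies w_j
    return a, w

def reconstruct(a, w, n):
    """Sum the components g_j(t) = a_j * exp(i * w_j * t) at t = 0, 1, ..., n-1."""
    t = np.arange(n)
    return sum(a_j * np.exp(1j * w_j * t) for a_j, w_j in zip(a, w))

# Illustrative signal: two superimposed oscillations sampled at 64 points.
t = np.arange(64)
signal = np.sin(2 * np.pi * 3 * t / 64) + 0.5 * np.cos(2 * np.pi * 10 * t / 64)

a, w = fourier_components(signal)
rebuilt = reconstruct(a, w, len(signal))
max_error = float(np.max(np.abs(rebuilt - signal)))
```

Only the components at the two frequencies present in the signal carry non-negligible amplitude, which is what makes the representation useful for analysis and compression.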


Some  multiresolution  methods  include  multiresolution  analysis  and  multiscale  geometric 
analysis,  and  multigrid  and  algebraic  multigrid  (Dolbow  et  al  2004). 

Multiresolution  analysis  and  multiscale  geometric  analysis 

Given a basis function $\varphi$ in $L^2(\mathbb{R})$, we consider the scaled and translated copies of $\varphi$: $\varphi_{j,k}$ for $j$ and $k$ in $\mathbb{Z}$. The subset of $L^2(\mathbb{R})$ describable by linear combinations of the functions $\varphi_{j,k}$ can be written as:

$$V_j = \operatorname{span}\{\varphi_{j,k} \mid k \in \mathbb{Z}\}$$

If every $f \in L^2(\mathbb{R})$ can be approximated arbitrarily well by the set of $\varphi_{j,k}$ and $\varphi$ fulfills a refinement equation, we say that $\varphi$ or the $V_j$'s build a multiresolution analysis (Schneider and Kruger, 2007). Assume that $\varphi$ fulfills a refinement equation, so that $V_j \subset V_{j+1}$ for every $j \in \mathbb{Z}$. As such, there is the orthogonal complement $W_j$ of $V_j$ in $V_{j+1}$ and:

$$V_j \oplus W_j = V_{j+1}$$

$W_j$ is called the detail space or the wavelet¹² space for $V_j$. So, given a level $J$ we want to approximate, we have:

$$V_J = V_{J-1} \oplus W_{J-1} = V_{J-2} \oplus W_{J-2} \oplus W_{J-1} = \dots = V_0 \oplus \bigoplus_{j=0}^{J-1} W_j$$

11 The Fourier transform decomposes a function of time (signal) into the frequencies that make it up.

12 A wavelet is a short wavelike function that can be scaled and translated.
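
The nested splitting of $V_J$ into a coarse space and detail spaces can be illustrated with the Haar wavelet, which is chosen here purely as the simplest example and is an assumption of this sketch rather than a choice made in the cited sources. Each step splits the current coefficients into a coarse part (one level down) and a detail part, and the reconstruction reverses the steps.

```python
import numpy as np

def haar_step(v):
    """Split level-j coefficients into coarse (V_{j-1}) and detail (W_{j-1}) parts."""
    coarse = (v[0::2] + v[1::2]) / np.sqrt(2.0)
    detail = (v[0::2] - v[1::2]) / np.sqrt(2.0)
    return coarse, detail

def haar_decompose(v):
    """Repeat the split until one V_0 coefficient remains (len(v) must be a power of 2)."""
    v = np.asarray(v, dtype=float)
    details = []
    while len(v) > 1:
        v, d = haar_step(v)
        details.append(d)
    return v, details[::-1]      # V_0 part, then W_0, W_1, ..., W_{J-1}

def haar_reconstruct(v0, details):
    """Rebuild the finest level from V_0 and the detail spaces, coarse to fine."""
    v = np.asarray(v0, dtype=float)
    for d in details:
        fine = np.empty(2 * len(v))
        fine[0::2] = (v + d) / np.sqrt(2.0)
        fine[1::2] = (v - d) / np.sqrt(2.0)
        v = fine
    return v

data = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])   # J = 3 levels
v0, details = haar_decompose(data)
rebuilt = haar_reconstruct(v0, details)
```

Because the Haar split is orthogonal, the sum of squared coefficients is preserved, and small detail coefficients can be dropped for compression or noise removal.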


Multiresolution  expansion  based  on  wavelets  has  a  great  number  of  successful  applications  in 
data  compression  and  noise  removal  (Donoho  2002).  The  rapid  increase  in  new  data  sources, 
each  different  from  the  other,  sustains  the  need  for  expansions  that  uniquely  adapt  to  each 
data  type.  Wavelet  analysis  is  exceptional  for  representing  smooth  data  containing  point 
singularities  but  not  singularities  of  intermediate  dimensions,  which  in  some  cases  represent 
important  features.  This  suggests  that  in  higher  dimensions,  wavelets  do  not  suffice  and  there  is 
a need for a geometric multiscale analysis. Two approaches to geometric multiscale analysis have been discussed in this direction: a directional wavelet transform based on parabolic dilations, and analysis via anisotropic strips (Donoho 2002).


Multigrid  and  algebraic  multigrid  methods 

Mathematical models in science and engineering make extensive use of differential equations (Wesseling 1995). Multigrid methods target algorithmic efficiency in solving differential equations. A multigrid method usually starts with the application of a smoother (or relaxation method), which is often a simple iterative method such as the Jacobi or Gauss-Seidel method (Falgout 2006). The goal is to obtain a smooth error in a few iterations and then to move to a coarser grid on which the remaining error can be removed. The steps of the coarse-grid correction process are 1) transfer information to a coarser grid, 2) solve a coarse-grid system of equations, and 3) transfer the solution back to the fine grid. To illustrate, consider that we want to solve:


$$A_h u = b$$

where $A_h$ is the (original) real $n \times n$ matrix on the fine mesh and $u$ and $b$ are vectors in $\mathbb{R}^n$. The key components of multigrid are a restriction matrix $R$ and an interpolation matrix $I$ that change the grids (Strang 2006):

1. A restriction matrix $R$ transfers vectors from the fine to the coarse grid

2. The return to the fine grid is done by an interpolation matrix $I = I_{2h}^{h}$

3. The original matrix $A_h$ on the fine grid is approximated by $A_{2h} = R A_h I$

Note the use of a convenient ratio of 2 for grid spacing ($h$, $2h$, ...). Although different spacings are possible, for example $h_x$ and $h_y$ in two dimensions, a single mesh width $h$ is easier to visualize (Strang 2006).

The algebraic multigrid method solves linear systems based on the same multigrid concepts just described: smoothing and coarse-grid correction. The difference is that algebraic multigrid does not require explicit knowledge of the problem geometry. Instead, it is a matrix-based method. To illustrate, consider that we want to solve:

$$Au = b$$

where $A$ is the (original) real $n \times n$ matrix and $u$ and $b$ are vectors in $\mathbb{R}^n$. In linear algebra terms, the fine and coarse grids are represented by the vector space $\mathbb{R}^n$ and the lower-dimensional (coarser) vector space $\mathbb{R}^{n_c}$. The map from the coarse to the fine grid (interpolation) is denoted as the $n \times n_c$ matrix $P: \mathbb{R}^{n_c} \to \mathbb{R}^n$, and the map from the fine to the coarse grid (restriction) is the transpose of interpolation, $P^T$. The two-grid method for solving $Au = b$ is described next (Falgout 2006):

1. Do $\nu_1$ smoothing steps on $Au = b$

2. Compute residual $r = b - Au = Ae$

3. Solve $A_c e_c = P^T r$

4. Correct $u \leftarrow u + P e_c$

5. Do $\nu_2$ smoothing steps on $Au = b$

A few remarks about the above algorithm (Falgout 2006). The error $e$ is the difference between the exact solution and the current iterate: $e = A^{-1} b - u$. In step 3, we solve for $e_c$, the coarse approximation to the error $e$ (in practice, the coarse system is solved by recursively re-applying the method). The most popular approach to determine the coarse system $A_c$ is to use the Galerkin operator, $A_c = P^T A P$.
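
The five steps of the two-grid method can be sketched as follows. The test problem (a one-dimensional Poisson matrix), the weighted Jacobi smoother with weight 2/3, and the linear interpolation matrix are assumptions chosen for illustration; a practical algebraic multigrid code would construct $P$ from the entries of $A$ and recurse on the coarse system instead of solving it directly.

```python
import numpy as np

def poisson_matrix(n):
    """1D Poisson matrix on n interior points of (0, 1), mesh width h = 1/(n+1)."""
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def interpolation_matrix(n):
    """Linear interpolation P from nc = (n-1)//2 coarse points to n fine points."""
    nc = (n - 1) // 2
    P = np.zeros((n, nc))
    for j in range(nc):
        i = 2 * j + 1                # fine-grid index of coarse point j
        P[i, j] = 1.0
        P[i - 1, j] = 0.5
        P[i + 1, j] = 0.5
    return P

def jacobi(A, u, b, steps, omega=2.0 / 3.0):
    """Weighted Jacobi smoother."""
    d = np.diag(A)
    for _ in range(steps):
        u = u + omega * (b - A @ u) / d
    return u

def two_grid(A, b, u, P, nu1=2, nu2=2):
    u = jacobi(A, u, b, nu1)               # 1. pre-smoothing
    r = b - A @ u                          # 2. residual r = b - Au
    A_c = P.T @ A @ P                      #    Galerkin coarse operator
    e_c = np.linalg.solve(A_c, P.T @ r)    # 3. solve A_c e_c = P^T r
    u = u + P @ e_c                        # 4. coarse-grid correction
    return jacobi(A, u, b, nu2)            # 5. post-smoothing

n = 31
A = poisson_matrix(n)
P = interpolation_matrix(n)
b = np.ones(n)
u = np.zeros(n)
residuals = [np.linalg.norm(b - A @ u)]
for _ in range(10):
    u = two_grid(A, b, u, P)
    residuals.append(np.linalg.norm(b - A @ u))
```

The list `residuals` records the residual norm after each cycle, so the reduction per cycle can be inspected directly.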

Hybrid  methods 

The  goal  of  hybrid  methods  is  to  couple  models  and  numerical  representations  across  different 
scales  and  over  contiguous  domains  (Dolbow  et  al  2004).  The  stand-alone  models  are  not 
required to use a multi-resolution method. The exchange of information between these models has to accommodate potential differences in information type, such as discrete versus continuum and stochastic versus deterministic. The coupling strategy in hybrid methods depends heavily on information concerning error and uncertainty. This information is fundamental to adaptively choosing among the available algorithms and parameters at runtime, i.e. to selecting the coupling form and strength and to highlighting space and time regions where better descriptions are necessary. Some hybrid methods include partitioned-domain methods, hierarchical methods, and sequential and concurrent coupling methods (Dolbow et al 2004).

Partitioned-domain methods

Partitioned-domain methods rely on the atomistic and continuum partitioning of models. Fundamental to this type of partitioning is the computation of the total energy of a system as a function of the degrees of freedom, i.e. the atomic or finite element nodal positions (Curtin & Miller 2003). To determine energies and forces on individual atoms, it is common to use classical interatomic potentials, in which the total atomic energy $E^a$ can be obtained as:

$$E^a = \sum_i E_i$$

where $E_i$ is the energy of the $i$th atom. Classical interatomic potentials are mostly used in the embedded-atom method or the Stillinger-Weber framework (Curtin & Miller 2003). According to the embedded-atom method, the energy of an atom $i$ is given by:

$$E_i = F_i(\rho_i) + \frac{1}{2} \sum_{j \neq i} V_{ij}(r_{ij})$$

where $F_i$ is an electron-density dependent embedding energy, $V_{ij}$ is a pair potential between atom $i$ and the neighboring atom $j$, and $r_{ij}$ is the interatomic distance (Curtin & Miller 2003).

The electron density at atom $i$, denoted as $\rho_i$, is the superposition of density contributions from each of the neighbors $j$:

$$\rho_i = \sum_{j \neq i} \rho_j(r_{ij})$$

According to the Stillinger-Weber framework, the energy of an atom $i$ can be obtained as:

$$E_i = \frac{1}{2} \sum_{j \neq i} V_{ij}(r_{ij}) + \frac{1}{3!} \sum_{j \neq i} \sum_{k \neq (i,j)} V_{ijk}(\mathbf{r}_{ij}, \mathbf{r}_{ik})$$

where $V_{ijk}$ is the three-body potential and $\mathbf{r}_{ij}$ is the vector from atom $i$ to neighbor atom $j$ (Curtin & Miller 2003). When one atom is displaced in an atomistic simulation, the interaction energies are assumed to extend over neighbor distances within the range $R_{cut}$. In the absence of externally applied forces, the force on atom $i$ can be obtained as:

$$\mathbf{f}_i = -\frac{\partial E^a(\mathbf{r}_1, \dots, \mathbf{r}_N)}{\partial \mathbf{r}_i}$$

The  total  energy  and  the  forces  on  each  atom  allow  us  to  determine  the  equilibrium  atomic 
configuration  as  a  function  of  applied  forces  and  imposed  displacement  on  the  atoms. 
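
A small numerical sketch may make the energy and force definitions concrete. It assumes the usual embedded-atom form, an embedding term plus one half of the pair-potential sum, and uses toy functional forms (an exponential density contribution, an exponential pair potential, and a square-root embedding function) that are hypothetical and chosen only so the example runs. Forces are approximated by central finite differences of the total energy.

```python
import numpy as np

def eam_energy(positions):
    """Total energy E^a = sum_i E_i with toy (hypothetical) embedded-atom functions."""
    n = len(positions)
    total = 0.0
    for i in range(n):
        rho_i = 0.0       # electron density at atom i
        pair_i = 0.0      # sum of pair potentials V_ij(r_ij)
        for j in range(n):
            if j == i:
                continue
            r_ij = np.linalg.norm(positions[i] - positions[j])
            rho_i += np.exp(-r_ij)            # toy density contribution rho_j(r_ij)
            pair_i += np.exp(-2.0 * r_ij)     # toy pair potential V_ij(r_ij)
        total += -np.sqrt(rho_i) + 0.5 * pair_i   # toy embedding F_i(rho) = -sqrt(rho)
    return total

def forces(positions, step=1e-6):
    """Approximate f_i = -dE^a/dr_i by central finite differences."""
    f = np.zeros_like(positions)
    for i in range(positions.shape[0]):
        for d in range(positions.shape[1]):
            plus = positions.copy()
            plus[i, d] += step
            minus = positions.copy()
            minus[i, d] -= step
            f[i, d] = -(eam_energy(plus) - eam_energy(minus)) / (2.0 * step)
    return f

atoms = np.array([[0.0, 0.0], [1.1, 0.0], [0.5, 0.9], [1.7, 1.0]])
f = forces(atoms)
net_force = f.sum(axis=0)   # close to zero: no external forces act on the cluster
```

An equilibrium configuration is one where every row of `f` vanishes; a minimizer would move the atoms along `f` until that condition holds.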

Moving to the continuum side of the partitioning, continuum mechanics assumes that a strain energy density functional $W$ exists for a material and that the energy in an incremental volume $dV$ around point $X$ is $W(X)dV$ (Curtin & Miller 2003). The overall potential energy of the material, $E^c$, is obtained as the integral over the volume $\Omega$ of the body:

$$E^c = \int_{\Omega} W(X) \, dV$$


To determine the equilibrium strain field related to applied forces and displacements in the body, the overall potential energy $E^c$ must be minimized (Curtin & Miller 2003). A common method for the minimization is the finite element (FE) method. The idea is to determine the displacements $U_j = u(X_j)$ at a set of points $X_j$ (the nodes $j$). Predetermined interpolation or shape functions are used to determine the displacements at locations away from the nodes. Elements result from defining polyhedral regions with the nodes at the vertices so that the space is covered by elements. The total energy of the continuum region can then be obtained as the sum over the elements $\mu$:

$$E^c = \int_{\Omega} W \, dV = \sum_{\mu=1}^{N_e} E_\mu, \qquad E_\mu = \int_{\Omega_\mu} W(X) \, dV$$

where $N_e$ is the number of elements in the region and $\Omega_\mu$ is the volume of element $\mu$. The strain energy density functional $W$ depends on the deformation gradient $F$. Changes in the displacement of node $i$ trigger changes in the deformation gradients $F$ of all the elements in which node $i$ is contained and, therefore, a change in the total energy. The energies of the other elements are assumed to remain the same (Curtin & Miller 2003).

A critical aspect of partitioned-domain methods is the transition region between the atomistic and continuum partitions. There is no unified and formal theory of the transition that establishes quantifiable error bounds (Curtin & Miller 2003). There are, however, different ways to handle it. For example, Miller and Tadmor (2009) consider the idealized partitioning of a domain problem into $B^A$ and $B^C$, which represent the atomistic and continuum partitions of the problem, respectively. The interface between regions $B^A$ and $B^C$ is denoted as $B^I$, across which compatibility and equilibrium are imposed.

A common strategy to bridge regions $B^A$ and $B^C$ is to divide the interface region $B^I$ into the "handshake region" $B^H$ and the "padding region" $B^P$. $B^H$ is both atomistic and continuum. $B^P$ is continuum, but it is used to create atoms that provide the boundary conditions for the atoms in regions $B^A$ and $B^H$ (Miller & Tadmor 2009). This requirement results from the nonlocal nature of atomic bonds. The range of these atomic interactions, $R_{cut}$, determines the thickness of the padding region. The continuum displacement fields at the positions of the padding atoms determine their motion. A variation on this interface is to eliminate the handshake region.

A large number of partitioned-domain methods have been proposed in the literature (Miller & Tadmor 2009). These are the quasicontinuum method, the coupling of length scales method, the bridging domain method, the bridging scale method, the composite grid atomistic
continuum  method,  the  cluster-energy  quasicontinuum  method,  the  ghost-force  corrected 
quasicontinuum  method,  the  ghost-force  corrected  cluster-energy  quasicontinuum  method, 
the  finite  element/atomistics  method,  the  coupled  atomistics  and  discrete  dislocations  method, 
the  hybrid  simulation  method,  the  concurrent  AtC  coupling  method,  the  ghost-force  corrected 
concurrent  AtC  coupling  method,  and  the  cluster-force  quasicontinuum  method.  Although 
theoretically  different,  and  with  the  exception  of  the  coarse-grain  molecular  dynamics  method, 
these  methods  are  similar  at  the  level  of  implementation  (Miller  &  Tadmor  2009).  Differences 
exist  in  terms  of  the  governing  formulation,  the  coupling  boundary  conditions,  the  handshake 
region,  and  the  treatment  of  the  continuum.  A  thorough  comparison  of  these  methods  in  terms 
of  accuracy  and  efficiency  of  the  coupling  led  to  a  unified  framework  where  the  different 
methods  can  be  represented  as  special  cases  (Miller  &  Tadmor  2009). 

Hierarchical  methods 


In  hierarchical  methods,  numerical  techniques  are  independently  employed  at  different  length 
scales.  A  bridging  methodology  such  as  statistical  analysis  methods,  homogenization 
techniques,  or  optimization  methods  can  then  be  used  to  identify  the  relevant  cause-and-effect 
relations  at  the  lower  scale  and  their  impact  at  the  higher  scale  (Horstemeyer  2009).  An 
example  of  a  top-down  hierarchical  approach  is  the  use  of  thermodynamically  constrained 
internal  state  variables  at  the  macroscale  that  reach  down  and  receive  information  from 
multiple  subscales.  This  way,  the  internal  state  variables  macroscopically  average  the  details  of 
the  microscopic  configurations  and  capture  their  effects  but  not  the  causes  at  the  local  levels. 
The  assumption  is  that  the  complete  microscopic  arrangement  is  not  required  as  long  as  the 
macroscale  internal  state  variables  representation  is  complete  (Horstemeyer  2009). 

Sequential  versus  concurrent  coupling 

The  goal  of  sequential  coupling  methods  is  to  obtain  a  macroscopic  model  from  which  the 
macroscopic  behavior  of  systems  can  be  analyzed  under  different  conditions  (Weinan  2011). 
Microscopic models precompute or tabulate some of the functions or parameters that are inputs to the macroscopic models (tabulated values can also be interpolated). An example of sequential coupling,
also  called  precomputing,  microscopically  informed  modeling,  or  parameter  passing,  is  found  in 
gas  dynamics.  Kinetic  theory  can  be  used  to  precompute  the  equation  of  state,  which  is  stored 
in  a  look-up  table  and  later  used  in  Euler's  equations  of  gas  dynamics  to  simulate  gas  flow 
under  different  conditions.  Other  examples  include  the  study  of  macroscopic  properties  of 
fluids  and  solids  that  use  parameters  obtained  successively  from  quantum  mechanics  models.  A 
sequential  approach  is  not  feasible  when  the  unknown  components  of  the  macroscopic  model 
(parameters or functions) depend on many variables. For example, in molecular dynamics, the interatomic forces depend on the positions of all the atoms in the system, and it is impractical to precompute these forces as functions of the atomic positions for more than ten atoms (Weinan 2011). Concurrent coupling offers an alternative in which the unknown components are obtained "on the fly" as the computation evolves. For most numerical (concurrent) methods, the macroscale quantities of interest are obtained from appropriate microscale models and not from ad hoc macroscale models. Oftentimes, the technicalities seen in a sequential or concurrent approach are comparable. The right approach to use depends on how much is known about the macroscale process.
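
The look-up table idea behind sequential coupling can be sketched in a few lines. The function `micro_model` below is a hypothetical stand-in for an expensive microscopic calculation (it is not a kinetic-theory equation of state), and the table range and resolution are arbitrary assumptions.

```python
import numpy as np

def micro_model(density):
    """Hypothetical stand-in for an expensive microscopic pressure calculation."""
    return density * (1.0 + 0.3 * density)

# Precompute step: tabulate the micro model on a grid of macroscopic states.
table_density = np.linspace(0.1, 2.0, 20)
table_pressure = np.array([micro_model(rho) for rho in table_density])

def equation_of_state(density):
    """Macroscopic closure: interpolate linearly in the precomputed table."""
    return np.interp(density, table_density, table_pressure)

# The macroscopic solver only ever calls the table, never the micro model.
query = 1.234
approx = float(equation_of_state(query))
exact = micro_model(query)
```

With one input variable the table has 20 entries. With $d$ input variables at the same resolution it would need $20^d$ entries, which is why the sequential approach breaks down when the unknown components depend on many variables.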

Heterogeneous  multiscale  method 

The  heterogeneous  multi-scale  method  is  a  top-down  approach  that  relies  on  the  efficient 
coupling  between  macroscopic  and  microscopic  models  (Weinan  et  al  2007).  The  available 
macroscopic  information  about  the  process  under  consideration  (i.e.  macroscopic  variables  and 
structure)  is  first  entered  in  the  macroscopic  model.  Examples  of  information  include 
variational  structure,  conservation  laws,  diffusion  processes,  etc.  The  microscale  models  have 
to  be  consistent  with  the  just  identified  macroscale  structure.  In  the  context  of  fluids  and  solids, 
for  instance,  this  means  to  derive  conservation  laws  from  molecular  dynamics  and  to  express 
the  stress  in  atomistic  variables. 

The  general  setting  can  be  described  as  follows  (Weinan  et  al  2007).  Consider  a  microscopic 
system  and  a  microscale  model,  which  can  be  abstractly  described  as: 

$$f(u, b) = 0$$

where $u$ is the system's state variable and $b$ represents the auxiliary conditions of the problem (e.g. initial and boundary conditions). The microscopic details of $u$ are of no interest; instead, we want to determine the macroscopic state of the system, $U$, which satisfies some abstract macroscopic equation:

$$F(U, D) = 0$$

where $D$ represents the macroscopic data necessary for the macroscopic model to be complete. Assume that the compression operator $Q$ maps $u$ to $U$, and an operator $R$ reconstructs $u$ from $U$:

$$Qu = U$$

$$RU = u$$

and that $QR = I$ is satisfied ($I$ stands for the identity operator). The purpose of the
heterogeneous  multiscale  method  is  to  determine  U  using  the  abstract  macroscopic  equation  F 
and the microscale model. Despite the incompleteness of the macroscopic model, a macroscopic solver must be selected, and all available information on the form of $F$ is used to do so. To estimate the required macroscale data, a series of constrained microscale simulations consistent with the local macroscopic state, i.e. $b = b(U)$, is carried out, and the microscopically generated data are then used to extract the required macroscale data. Data estimation can be conducted "on the fly" or in a pre-processing step (as in the concurrent and sequential coupling methods, respectively). There is no direct communication between the microscale models; all communications are performed through the macroscale solver (Weinan et al 2007).
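
The following sketch illustrates the structure of the method on a toy problem that is an assumption of this example, not taken from the cited work: a slow variable $x$ with $dx/dt = -y$ and a fast variable $y$ with $dy/dt = (x - y)/\varepsilon$. The macroscopic state is $U = x$, the macroscopic solver is forward Euler, and the missing macroscale data (the right-hand side) is estimated by short microscale runs constrained to the current value of $U$.

```python
import math

EPS = 1e-3  # hypothetical scale separation of the toy micro model

def micro_estimate(U, y, micro_steps=50, micro_dt=2e-4):
    """Constrained micro run: hold the slow variable at U, relax the fast
    variable y, and return the averaged slow force and the final fast state."""
    samples = []
    for k in range(micro_steps):
        y += micro_dt * (U - y) / EPS      # fast dynamics with x held at U
        if k >= micro_steps // 2:          # discard the initial transient
            samples.append(-y)             # slow force dx/dt = -y
    return sum(samples) / len(samples), y

def hmm_solve(U0, t_end, macro_dt=0.05):
    """Forward-Euler macro solver whose right-hand side comes from micro runs."""
    n_steps = int(round(t_end / macro_dt))
    U, y = U0, U0                          # reconstruct a consistent micro state
    for _ in range(n_steps):
        D, y = micro_estimate(U, y)        # estimate the missing macroscale data
        U += macro_dt * D                  # macro step
    return U

U_final = hmm_solve(1.0, 1.0)
exact = math.exp(-1.0)   # solution of the effective equation dU/dt = -U at t = 1
```

Each macro step of size 0.05 requires only 0.01 time units of micro simulation, and the micro runs interact solely through the macro state `U`.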

Adaptive methods


The  goal  of  adaptive  methods  is  to  minimize  the  error  and  uncertainty  in  simulation  and  data 
representation  (Dolbow  et  al  2004).  There  is  a  trade-off  between  the  efficiency  of  a  coarse  scale 
simulation  and  the  precision  of  a  detailed  one.  The  computational  effort  is  locally  adjusted  to 
keep a uniform level of precision throughout the problem domain (Colella, n.d.). While adaptive methods, specifically adaptive mesh and algorithm refinement, may look similar to hybrid methods, there are fundamental differences between them (Garcia et al 1999). For instance, adaptive mesh and algorithm refinement explicitly works as a multi-level method that simulates systems whose length scales span several orders of magnitude. Furthermore, it is fully three-dimensional whereas some hybrid methods are limited to the simulation of one- or two-dimensional problems.

Adaptive  mesh  refinement 

Adaptive  mesh  refinement  is  a  numerical  method  for  solving  a  class  of  partial  differential 
equations  in  one  or  more  dimensions  (Berger  &  Colella  1989).  It  hinges  on  a  series  of 
embedded,  logically  rectangular  grids  on  which  the  partial  differential  equation  is  discretized. 

An  error  estimation  procedure  based  on  user-specified  criteria  determines  where  additional 
refinement  is  necessary  (Garcia  et  al  1999).  Grid  generation  procedures  dynamically  create  or 
eliminate finer grid patches as resolution requirements vary. To illustrate, consider a sequence of levels $l = 1, \dots, l_{max}$ and define a grid $G_l$:

$$G_l = \bigcup_k G_{l,k}$$

where grid $G_{l,k}$ has mesh spacing $h_l$. Each grid is a subset of the rectangular discretization of the entire space (Berger & Colella 1989). Overlapping grids at the same level $l$ are possible; yet, the discrete solution must be independent of the decomposition of level $l$. Grids at different levels must be properly embedded. Specifically, a fine grid starts and ends at the corner of a cell of the next coarser grid, and there must be at least one level $l - 1$ cell in some level $l - 1$ grid that separates a cell of the finer level $l$ grid from a cell of the coarser level $l - 2$ grid in the north, south, east, and west directions (Berger & Colella 1989). The exception is when the cell is on the border of the physical boundary of the domain.
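
A one-dimensional sketch of the flag-and-refine cycle is given below. The flagging criterion (a jump between neighboring cells above a tolerance) is a simple user-specified stand-in for an error estimator, and the function names and the refinement ratio of 2 are assumptions of this example. The fine patch starts and ends on coarse cell edges, which is the one-dimensional analogue of starting and ending at the corner of a coarse cell.

```python
import numpy as np

def flag_cells(values, tolerance):
    """Flag cells whose jump to a neighboring cell exceeds the tolerance."""
    jump = np.abs(np.diff(values))
    flags = np.zeros(len(values), dtype=bool)
    flags[:-1] |= jump > tolerance
    flags[1:] |= jump > tolerance
    return flags

def refine(cell_edges, flags, ratio=2):
    """Return the edges of one finer patch covering all flagged coarse cells."""
    flagged = np.flatnonzero(flags)
    if flagged.size == 0:
        return None
    lo, hi = flagged[0], flagged[-1] + 1       # coarse cell index range
    n_fine = (hi - lo) * ratio
    return np.linspace(cell_edges[lo], cell_edges[hi], n_fine + 1)

coarse_edges = np.linspace(0.0, 1.0, 17)                 # 16 cells, h_1 = 1/16
centers = 0.5 * (coarse_edges[:-1] + coarse_edges[1:])
solution = np.tanh(40.0 * (centers - 0.5))               # steep front at x = 0.5
flags = flag_cells(solution, tolerance=0.1)
fine_edges = refine(coarse_edges, flags)
```

Only the four coarse cells around the front are flagged, so the finer patch with spacing $h_2 = h_1 / 2$ covers the interval from 0.375 to 0.625 and the rest of the domain stays coarse.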

Grids  with  finer  mesh  width  in  space  will  also  have  smaller  mesh  width  in  time  (Berger  1982).  In 
other  words,  refinement  is  done  in  both  space  and  time  by  the  same  refinement  ratio  (Berger  & 
Colella  1989).  The  mesh  refinement  algorithm  includes  an  error  estimation  procedure  and  an 
integration  algorithm.  There  are  three  components  in  the  integration  algorithm:  the  actual  time 
integration on each cell (application of finite differences), the error estimation and subsequent grid creation, and the grid-to-grid operations required at each time step (Berger 1982).

Adaptive time-step algorithm


The  adaptive  time-step  algorithm  makes  use  of  appropriate  time  steps  based  on  Courant- 
Friedrichs-Lewy  (CFL)  considerations  to  advance  the  different  levels  (Garcia  et  al  1999).  The 
algorithm carries out operations that advance each level independently of the other levels in the hierarchy, the exception being the operations for the boundary conditions. Finally, a synchronization of the levels takes place: the fine grid is averaged onto the coarse grid, and the difference in flux between the coarse and fine grids at their boundary is corrected (Garcia et al 1999).

Adaptive  mesh  and  algorithm  refinement 

The  adaptive  mesh  and  algorithm  refinement  uses  a  particle  method  to  evaluate  flux  in  regions 
where  microscopic  resolution  is  required  and  a  continuum  method  with  variable  levels  of 
refinement  otherwise  (Garcia  et  al  1999).  The  algorithmic  structure  of  the  adaptive  mesh  and 
algorithm  refinement  method  is  comparable  to  that  of  the  adaptive  mesh  refinement  method. 
The  difference  is  in  the  use  of  a  direct  simulation  Monte  Carlo  calculation  to  evaluate  the  finest 
grid  level.  There  are  four  main  routines  that  address  the  interaction  between  the  continuum 
solver and the direct simulation Monte Carlo region. These routines i) pass the time-interpolated state to the particle buffer cells (buffer cells surround the direct simulation Monte Carlo region), ii) pass the momentum and energy corrections to the direct simulation Monte Carlo region, iii) receive the fluxes stored when particles cross the direct simulation Monte Carlo interface, and iv) receive conserved densities for continuum cells that cover the direct simulation Monte Carlo region (Garcia et al 1999).

Equation-free  multi-scale  method 

The  equation-free  multi-scale  method  is  designed  for  systems  in  which  macroscopic  evolution 
equations  exist  but  are  not  available  in  a  closed  form.  Modeling  through  macroscopic 
equations,  if  possible,  requires  assumptions  difficult  to  justify.  Instead,  fine-scale  models  are 
initialized on short time and small length scales to accomplish tasks at a macroscopic level. The method comprises different techniques such as coarse projective integration, the gap-tooth scheme, and patch dynamics (Dada & Mendes 2011).

Coarse  projective  integration 

Microscopic  simulations  are  performed  and  the  solutions  used  to  determine  the  average  values 
of  the  coarse  variables.  These,  in  turn,  are  used  to  compute  the  coarse  time  derivatives 
required  to  extrapolate  the  coarse  variables  over  larger  time  steps.  The  microscopic  simulations 
use  initial  data  that  is  coherent  with  the  present  macro-state. 
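
A minimal sketch of one way to organize these steps is shown below. The fine-scale simulator is replaced by a toy model ($dx/dt = -x$ integrated with small Euler steps), and the burst length and projection step are arbitrary assumptions; in a real application `micro_step` would be a microscopic simulation whose output is averaged into the coarse variable.

```python
import math

def micro_step(x, dt):
    """One step of a fine-scale simulator (toy stand-in: dx/dt = -x)."""
    return x + dt * (-x)

def projective_step(x, micro_dt, n_micro, big_dt):
    """Run a short micro burst (n_micro >= 1), estimate the coarse time
    derivative from its last two states, and extrapolate over big_dt."""
    for _ in range(n_micro):
        x_prev, x = x, micro_step(x, micro_dt)
    slope = (x - x_prev) / micro_dt      # estimated coarse time derivative
    return x + big_dt * slope            # projective (extrapolation) step

x, t = 1.0, 0.0
micro_dt, n_micro, big_dt = 0.001, 5, 0.02
for _ in range(40):
    x = projective_step(x, micro_dt, n_micro, big_dt)
    t += n_micro * micro_dt + big_dt     # time covered by burst plus projection

exact = math.exp(-t)
```

In this configuration only one fifth of the simulated time is spent in the micro simulator; the remainder is covered by extrapolating the coarse variable.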

Gap-tooth  scheme 

The idea is to cover the space with teeth (small domains over a short time period) and intermediary gaps to approximate the evolution of a macroscopic equation. The simulation of the microscopic evolution is performed within each tooth. Boundary conditions at the edges of each tooth need to be specified.


Patch dynamics

Patch dynamics is the combination of coarse projective integration with the gap-tooth scheme.


5.1.3  Other  Methods 

Multi-scale  agent-based  modeling  method 

Multi-scale  agent-based  modeling  models  the  behavior  of  autonomous  agents  -  individually  or 
collectively  -  and  their  interactions  in  order  to  simulate  their  impact  on  the  overall  system.  It  is 
object-oriented, rule-based, discrete-event, and discrete-time (Dada & Mendes 2011).

Complex  automata 

The  method  hinges  on  the  idea  that  systems  can  be  decomposed  into  N  single-scale  cellular 
automata  that  interact  across  spatial  and  temporal  scales.  The  graphical  representation  of  each 
subsystem and its respective scales is frequently done using a scale separation map. The exchange of information across subsystems is achieved through coupling mechanisms such as sub-domain coupling and hierarchical-model coupling. In the former, different models describe neighboring spatial domains (possibly with different resolutions), while in the latter, some parameters of the central model are computed as necessary by lower resolution models (Hoekstra et al 2007).

Multi-scale  numerical  scheme 


The multi-scale numerical scheme aims at finding the numerical solution of the bidomain equations. The bidomain equations are a system of an elliptic partial differential equation and a parabolic partial differential equation, coupled at each point in time by a system of non-linear ordinary differential equations (Whiteley 2008). These equations are frequently used to model cardiac electrophysiology. The multi-scale numerical algorithm assumes that computation at a high resolution is needed only for a very small number of variables that change on a short time-scale and short length-scale. A fine mesh is used to approximate these rapidly varying variables whereas a coarser mesh is used to compute the remaining ones. When required, linear interpolation is used to transfer the slower variables onto the finer mesh.

In  situ  adaptive  tabulation  multi-scale  approach 

The  approach  focuses  on  multi-scale  problems  where  a  large  number  of  ordinary  differential 
equations  with  identical  initial  conditions  needs  frequent  evaluation.  Instead  of  solving  for  all 
the  equations,  previously  calculated  solutions  are  stored  and  used  as  approximations  whenever 
a  new  solution  with  similar  initial  conditions  is  needed.  These  approximations  also  satisfy  a 
given  error  tolerance  (Dada  &  Mendes  2011). 
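
The store-and-reuse idea can be sketched as below. This is a deliberately simplified illustration: a stored solution is reused unchanged whenever a new initial condition lies within a fixed distance of a stored one, and the class name, the distance-based reuse rule, and the toy ODE are all assumptions of this sketch rather than the published algorithm.

```python
class SolutionTable:
    """Store ODE solutions keyed by initial condition and reuse nearby entries."""

    def __init__(self, solver, tolerance):
        self.solver = solver          # expensive map: initial condition -> solution
        self.tolerance = tolerance    # maximum distance for reusing a stored entry
        self.entries = []             # list of (initial_condition, solution) pairs
        self.solver_calls = 0

    def query(self, x0):
        for stored_x0, stored_solution in self.entries:
            if abs(stored_x0 - x0) <= self.tolerance:
                return stored_solution          # retrieve a stored solution
        self.solver_calls += 1
        solution = self.solver(x0)              # solve, then add a new entry
        self.entries.append((x0, solution))
        return solution

def solve_decay(x0, t_end=1.0, steps=1000):
    """Toy ODE dx/dt = -x integrated with forward Euler (hypothetical stand-in)."""
    x, dt = x0, t_end / steps
    for _ in range(steps):
        x -= dt * x
    return x

table = SolutionTable(solve_decay, tolerance=0.01)
queries = [1.000, 1.004, 0.998, 1.500, 1.505, 1.002]
results = [table.query(q) for q in queries]
```

Six queries trigger only two solver calls here. The reuse distance controls the trade-off between the number of solves and the approximation error of the retrieved values.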

5.1.4 Discussion and conclusions

While  there  are  different  strategies  to  multi-scale  modeling,  the  key  challenge  is  how  to  couple 
stand-alone  models  (Weinan  et  al  2007): 

•  We  can  match  the  models  using  handshake  regions  or  across  interfaces 

•  We  can  impose  constraints  on  the  micro-scale  model  to  ensure  consistency  with  the 
local  macro  state 

•  We  can  extract  the  macro-scale  data  from  the  micro-scale  simulations 

•  We  can  link  micro-scale  simulations  on  small  boxes  to  mimic  micro-scale  simulations 
over  the  entire  domain 

The  review  of  the  literature  shows  a  great  emphasis  on  the  available  theories,  research,  and 
models  at  the  different  scales  but  not  on  the  coupling  strategy,  which  is  the  aspect  that  is  most 
relevant  to  this  research  effort. 

As far as selecting the models to be composed, some applications seem to reuse available models while others seem to use whatever models are standard for that scale. Regardless, most
of  the  models  have  previously  been  validated.  Unfortunately,  almost  always,  no  explanation  on 
model  selection  is  given.  Particularly  relevant  to  this  effort  is  how  overlap  among  the  various 
models  is  handled.  Not  surprisingly,  partitioning  of  the  domain  space  is  a  way  to  prevent  model 
overlap,  but  overlap  exists  explicitly  in  some  multiscale  methods. 

Of course, when one creates a multi-scale model, validity is an issue. The validity of the stand-alone models does not determine the validity of the multi-scale model. Moreover, it is
important  to  look  for  contradictions  and  incompatibilities  between  the  individual  models.  Some 
models  may  be  based  on  theories  that  are  hundreds  of  years  old  and  have  been  repeatedly 
tested,  while  others  may  be  much  more  recent  and  tentative.  Consequently,  we  observe  that 
experiments  have  been  fundamental  to  validate  multi-scale  models. 

Of course, the critical question is whether these multi-scale modeling techniques can be used to identify unintended consequences. Throughout the literature, there seemed to be an implicit assumption that multi-scale modeling is the path to scientific discovery and engineering design. However, the review shows that the focus of most applications tends to be on prediction that is validated based on comparisons to experimental results. Thus, these multi-scale modeling techniques are effectively forms of interpolation, and discovery is an outcome of exploration and not interpolation.

Our  hypothesis  is  that  a  lack  of  a  systematic  process  for  component  model  selection  is  the 
reason  for  this  outcome.  It  is  the  exploration  of  alternative  model  structures  that  has  the 
potential  to  identify  unintended  consequences.  When  the  model  selection  is  either  ad  hoc  or 
based  on  standards  and  then  "tuned"  to  match  experimental  data,  one  has  effectively  removed 
the ability to generate alternative consequences. Thus, if something unexpected is going to
occur, it is going to occur in the experiment. The reason is that if the model deviates from the
experiment, it is adjusted to match the experiment.

It  is  also  worth  noting  that  when  we  originally  considered  this  topic  in  engineering,  one 
approach  that  seemed  relevant  was  multi-disciplinary  optimization  (MDO).  In  fact,  some  of  this 
literature  was  discussed  in  the  RT-138  final  report  (Pennock  et  al  2016).  However,  it  was 
realized  during  the  course  of  the  investigation  that  MDO  is  a  special  case  of  the  multi-modeling 
problem  where  any  overlap  issues  among  the  models  are  tolerable.  In  the  language  that  will  be 
developed  in  Section  6,  there  are  either  no  or  one-way  transition  linkages  among  the  models. 
Consequently,  while  these  techniques  are  important  to  engineering  in  general,  they  are  less 
informative  for  addressing  the  challenges  examined  in  this  study. 


5.2  Finding  Unintended  Consequences  in  the  Social  Sciences 

While there are many possible starting points within social science, it is useful to start with
measure theory. We begin the discussion by expanding upon the prominent modeling
approaches within psychometrics, econometrics, and, to the extent possible, sociology. Measure
theory paradigms have sub-domains and categories, but we will use latent variable theory (LVT)
because it has been identified as useful for cognitive behavioral phenomena. It is also useful to
know that, as with much social theory, this paradigm is embedded in other modeling
paradigms: randomized controlled trials, machine learning techniques, and dynamical equilibrium,
to name some prominent ones. Because there is no 'conscious'-meter or 'economicus'-meter
per se, social measurement has to constantly assume at least the possibility of latent effects and
their influence on potential model transformations.

We review this literature with two aims in mind for socio-technical enterprises. The first is to
establish how latent or relative model phenomena in enterprises can be treated with established
methods from previous research. The walkthrough below borrows the psychometric perspective, as
it is presumed to be more useful for making the observable points. It is therefore rooted in the
language of item response theory and related formal stances, which view 'technical' aspects as
'items' toward which cognitive actors respond. It is also where the underlying mathematical
measure theory developed, so it presumably embeds the theoretic motivations. Also, in
enterprises, individual (or small group) behavior is often where the modeling difficulties occur.
For instance, the previous RT reviewed a case in which higher-order effects occurred when
individual drivers acted on a conversing utility curve, which can be viewed as 'latent'. The
second aim is to consider how, more generally, the social sciences differ in their philosophy of
science in ways that would be relevant to enterprise systems engineers.

Regarding methodology, it is hoped that the audience has an appreciation of (linear)
algebra and normative statistics. The measure theory portion is based on this mathematics, which
should not be new to most modelers, but the working hypothesis is that there are unique
analytic considerations and spatial reasoning beyond what is traditionally covered. The review
lists specific methods where they could be found, but focuses instead on the theory and
intuition underlying the modeling, and specifically on the approach to error analysis. We refer
those interested in specific model classes or example uses to the cited works, and we have
favored works that point the reader to the most broadly referenced works found.
That is to say, the references are central not in the sense of being seminal but rather in
establishing theory or in being meta-reviews that themselves reference further works.

One impression is that many of the methods are not entirely different in their underlying
mathematics, but, as will be discussed, establishing observed items as a model within the social
science purview is often the crux of the analysis. As such, the 'language of the social science' is
important in maintaining the current referent between model and phenomena, and it is often
presented in dualistic, dialectic considerations. Social science cannot always take for granted
the symmetry between model objects and their objects of study. Observations based
in mathematical description will therefore try to be faithful to systems language, while
observations more firmly rooted in social science will use the terminology encountered
there. That said, many terms will often be synonymous; however, given the
complexity implicit in psycho-social elements, one should be most attentive to where
differences are created.

As a last note, the goal of the overview is to reach model groupings and observations on
theory use that both inform system practitioners and create descriptive constructs to relate the
breadth of literature. The sections of this write-up therefore cover LVT classes as given by theory
development and by our impression of current usage. Note, however, that these classes are not
necessarily consistent across social theory sub-disciplines. For instance, the 'machine learning'
paradigm uses statistical measures and linear algebraic manipulations, but one might classify its
usages differently. The classes here are those from the Psychometric Society and categorical
review papers, with some applicable measure examples in Econometrica. (For additional
background see Epstein & Zhang 2001; Ploberger & Phillips 2003; Angeletos & Pavan 2007;
Matsuyama 2004; Giraud 2014.) In subsequent sections, these classes are covered more
broadly. Here, however, the aim is to explain within one theory area how the epistemic
considerations develop and that these considerations do appear to carry across modeling efforts.


5.2.1  Starting  Model  Set-Up:  Item  Response  and  the  Single  Factor  Model 

A fundamental starting point for creating a variety of complex measures is the desire for
a simple objective explanation with which to begin future observations; a 'kernel', if one will.
Factor modeling is often traced to Spearman and the venerable intelligence factor (intelligence
quotient, IQ), with which anyone who has taken standardized tests will be familiar. In
modern terms, the motivation for an "intelligence" factor might be better described as a human
ability for "anti-entropic" mental capacity, and the quotient is an attempt to find potential
quotient relationships amongst a group. Human abilities generally had been explored within
psychology, yet there is a problem of conflicting methods that must be overcome when measuring
phenomena manifested by humans. By comparison to a physical system, one would like to create
something similar to a thermodynamic measure by directly observing a variant created by the
system (e.g. a thermometer for thermodynamic temperature), and use statistical inference to
obtain objective descriptions.

The question became how one does this for a 'human measure method' with enough
empiricism to replicate measures similar to thermodynamics -> thermometer. Human
phenomena were (and perhaps still are) not objectively known well enough to establish what
could be the equivalent of an 'intelligence-meter', even though at the time an intelligence
phenomenon was established qualitatively (Cudeck & MacCallum 2012). In particular, humans
exhibit means for discerning others' traits because we are conscious beings, but then one loses
the objectivity nicely given by physical meters. Hence there is a seeming inconsistency in
psycho-social measurement: ideal symmetries trade off against objectivity, and both affect the
ability to make model inferences.

There is then an ongoing abstraction to split at the outset, as one needs a qualitative
description but lacks a sufficient analytical means to objectify this linguistic description. The
question for Spearman then became how one creates a means to split this difference for a
particular factoring when the factor is inherently what is now termed 'latent' or 'not directly
measurable'. As Bartholomew notes, the original paper spends only an appendix on
factor modeling, and the rest is devoted to determining epistemically how a "General Intelligence"
description can be gained from "[generalizing] Sensory Discrimination" (Bartholomew 1995). In
Bartholomew's words describing Spearman,

"Of  more  immediate  relevance  to  factor  analysis,  [Spearman]  states 
what  he  calls  "our  general  theorem",  which  is  "whenever  branches  of 
[cognitive]  activity  are  at  all  dissimilar,  then  their  correlations  with 
one  another  appear  wholly  due  to  their  being  all  variants  wholly 
saturated  with  some  common  fundamental  function  (or  group  of 
function)".  He  distinguishes  this  central  Function  from  "specific 
factors seem in every instance new and wholly different from that in
all others".

The lesson for modelers is to first consider the objects involved and how one can rely on
reasonable theorems to relate them to our sensory experience. Certainly one can claim that any
experience is then within purview, but within this purview can be subjective experience, which
induces possible subjectively created phenomena: e.g. placebo effects, self-fulfilling
prophecies, unintended consequences, etc. This quickly leads into theory on causality which,
while applicable, is both outside our purview and specifically what Spearman and modelers would
like to avoid. Spearman's observation was congruent with system modelers' intention of wanting
orthogonal measures (his "branches of activity"). The reasoning is that if these
objectively orthogonal measures are "saturated" with covariates, then one can reason that
the remaining variant factor is wholly 'human' or 'social theoretic'. This should prompt
modelers that measuring within the social sciences starts from establishing a reasonable
ontology, but additionally from choosing objects such that this 'saturation effect' can
map to the 'psycho-social' phenomena of interest: not just in the choice of objects but in
couching those objects within an epistemic type (sensory item, cognitive item, latent
personality, etc.), as Spearman notes.

Taking the standardized testing example, many students are familiar with the itemized set-up.
Individual items with a bounded answer set define a measurement analysis domain (i.e. an
answer sheet) paired with an individual amongst a group. This provides a mapping of an
individual, or a group of individuals, from their responses to a defined objective measure; the
initial insight being the creation of what is now recognized as a measure set (Lebesgue or Borel)
per individual and, by inverse, a measure of items per individual. A trivial difference measure
would be to compare an individual's responses to the ideal answer set (grading) to assign a
numerical value to each individual (score), and then rank order individuals by comparison to a
bounded axiomatic set of numbers (students along a 0-100 point scale).

While this serves as an objective inference on answering, it falls into inconsistency because the
'answer set' is its own item response from other individuals, and is thus a subjective
determination of the measure. The question posed by Spearman was, considering latent
effects such as cognition or personality, how one reasonably maps the objective score
responses to human phenomena that appear to be 'higher-order' effects, which
mathematically are ordinal phenomena; and in particular how to do this given what we now
know as priming, environment, and the general bias that others embed within the
'measuring device'. There is a whole discipline that examines the effects of
different experimental set-ups (i.e. measuring device configurations), termed experimental
theory and experimental design (Shadish et al 2001). For architecture and design disciplines,
the latter are suggested as primers.

The main contribution that then spawned (latent) factor analysis in the social sciences was to
instead use regression on the measures and then analyze the correlational space for
difference measures. In our testing example, this would mean regressing the answers across
multiple statistical moments: individual scoring, individual scores across items, and item scores
across tests. In doing so one again gains 'measure sets', this time from their correlation
matrices. As noted in the previous study (RT-138), the examination of different ordinals and
statistical moments was a modeling basis for much of the 'warning signal' literature (Pennock et
al 2016), so the basis of LVT shares the same intuition.

The argument then is that unique internalized factors such as intelligence would show
themselves through changes in normalized responses. Defining a human factor as an
objective ability is answered less by the question "How does this factor present directly from
a measure?" than by "How does this factor present itself nonrandomly over iterations of a
measure?". For those familiar with the general intelligence model (g-theory), this is a
one-factor model ("across item" intelligence) from a predefined item response (intelligence test).

Inferring the measurement can now be seen through the partial correlation coefficients (Yule
1897). One can recognize the form of a distance measure in variables taken from a matrix space
over the computed partial correlations. Analyzing the correlation coefficients can show ordering
patterns across moment coefficients, within a linear algebraic representation, compared to a
hypothesized general factor. This allows an objective basis for examining more ordinal
responses, presuming these are expressed across statistical moments.

Identifying that there are non-random profiles within the partial correlations, in this case
between individual answers, is then the goal. As interesting support, there was a competing
model based on sampling, which may be more familiar to those with an informationalist
background (Thomson 1950). Even though the underlying sampling model corresponds better
to modern understanding of brain functioning (Mackintosh 1998) (what most might recognize as
Bayesian updating), Bartholomew points out that identifying the higher order effects needed to
break our 'inconsistency psychometric problem' again requires statistical moments and is thus
congruent with the Spearman model. Even across first order response models, this
'intelligence factor' is identified by non-standard jumps in response profiles and recognized by
the ability to take in and respond at a statistically higher ordinal: being able to read, respond,
and iterate over the course of several items, in a way observable across different first order
models. When seeking to make a single factor model, the factor variable (IQ) is then composed
from the correlation variables rather than from the direct test score model, which helps resolve
the subjectivity inference problem regarding the item choice and inference basis. From a
modeler's perspective, this would seem to be an inconsistent choice of model basis. In fact,
by the theorem, the basis algebra is relatively irrelevant, as the factor is constructed
irrespective of the first order model choice (though the choice is still ultimately important, as
the observable measures still need mapping to a basis algebra). As any choice of a particular
model basis must still be broken into an ordinal spacing, the kernel of the analysis is the
covariate profile.
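
The role of partial correlations in the single factor argument can be illustrated with a small numerical sketch. It is not taken from the cited works: the loadings are hypothetical, the items are assumed standardized, and the general factor is assumed to have unit variance, so that the correlation of an item with the factor equals its loading. Under those assumptions the association between two items disappears once the common factor is held fixed, while holding another item fixed leaves a residual association.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held fixed."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


# Hypothetical standardized loadings of three test items on one general factor g.
LOADINGS = [0.9, 0.8, 0.7]


def item_corr(i, j):
    """Correlation between items i and j implied by a single common factor."""
    return LOADINGS[i] * LOADINGS[j]


# Holding the general factor fixed removes the association between items 0 and 1.
r01_given_g = partial_corr(item_corr(0, 1), LOADINGS[0], LOADINGS[1])

# Holding item 2 fixed (only a noisy proxy for g) leaves a residual association.
r01_given_item2 = partial_corr(item_corr(0, 1), item_corr(0, 2), item_corr(1, 2))

print("partial correlation given g:     ", r01_given_g)
print("partial correlation given item 2:", r01_given_item2)
```

In practice the factor is latent and cannot be conditioned on directly, which is why the analysis works from the pattern of correlations among the observed items.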

From this the general LVT form can be presented. Below is the basis for the model estimation,
which identifies the general variable objects. This is then presented as a matrix over the
covariate-correlation profiles, in expected value form. While additional assumptions
are needed for an analytic solution, one can get a general algebraic appreciation from
Moustaki et al (2015):

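$$x_i = \tau_i + \lambda_i \xi + \delta_i$$

$x$ = 'observed variable'
$\xi$ = 'common variable' (i.e. latent factor)
$\lambda$ = 'factor loading'
$\delta$ = 'unique factor'
$\tau$ = 'constant factor' (if needed)

$$\Sigma_0 = \begin{pmatrix} \mathrm{Var}(x_i) & & \\ \mathrm{Cov}(x_{i+1}, x_i) & \mathrm{Var}(x_{i+1}) & \\ \vdots & \mathrm{Cov}(x_{i+2}, x_{i+1}) & \ddots \end{pmatrix} = \text{"Covariate Matrix"}$$

$$\mathrm{Var}(x_i) = \mathrm{Cov}(x_i, x_i) = E\left[(x_i - E(x_i))^2\right]$$

$$\mathrm{Cov}(x_{i+1}, x_i) = E(x_i x_{i+1}) - E(x_i)E(x_{i+1})$$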
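
As a worked step, consider the single factor form $x_i = \tau_i + \lambda_i \xi + \delta_i$ under assumptions that are not stated in the excerpt above and are added here only for illustration: $\tau_i$ and $\lambda_i$ are constants, and the unique factors $\delta_i$ are uncorrelated with the common factor $\xi$ and with each other. The entries of the covariate matrix then reduce to:

$$\mathrm{Var}(x_i) = \lambda_i^2\,\mathrm{Var}(\xi) + \mathrm{Var}(\delta_i)$$

$$\mathrm{Cov}(x_{i+1}, x_i) = \lambda_{i+1}\lambda_i\,\mathrm{Var}(\xi)$$

Under these assumptions the off-diagonal entries depend only on the loadings and the variance of the common factor, which is one way to read the statement that the covariate profile carries the latent factor.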


To close this first section, the use of strict assumptions for an analytic solution, and then
for statistical inference, is a needed constraint. Specific assumptions and convergence
estimations are covered in later sections, yet the generalized incompleteness is important to
emphasize. Involving an ordinal space and then numbering it potentially opens one up to an
incomputable space: the first order measures presumably have the cardinality of the rationals or
greater, but opening up ordinality puts one at the cardinality of at least the reals. While smaller
factorings can be negligible, the IQ 'full factor space' (e.g. environment, personal affect,
physical health, etc.) has cardinality greater than the direct item response, so one consistently
has to deal with the fact that the factoring function, even if found, is strictly injective.
Mathematically this speaks to the limits of defining a set space over a space of ordinally larger
size, but it also highlights more generally the difficulty of modeling within social theory given
the applicable algebraic 'collapsing' of factors. The 'field of factors' that could be affecting a
particular grouping has to deal with several potential moments. These have reasonable
solutions, but, as one can determine by analyzing the basic model, the size must inherently be
analytically reduced for any particular modality.

However, LVT makes the theoretical assumption, following the 'Spearman assumption', that if
one does find a particular solution, it is assumed to be an 'anti-entropic effect'. One then
attributes this simplification to a compressing statement about the 'psycho-social space',
however general that may be. This has been observed in IQ as a statement on class
factoring from the projection to the deviation profile, so IQ is represented as profile
shaping: a 'general ability that shapes group responses' using normalized statistical curves,
which is why IQ defines quotient relationships. But these are statements about a random profile
and, at least theoretically, cannot be a person-by-person theoretic ordering. So there is a general
statement on the factor space but no bijectivity to the set of response items or the set of
individuals. Although it is a relationship on the set of individuals, this measure would be better
described as a class ordering than as a direct set measure. This seems descriptive of the
observation that IQ has been more predictive across groups of individuals (interpersonal
factoring) than individual by individual (intrapersonal factoring). Note that we have talked
ourselves into the existence of ordinal and topological theories in this 'social space', which is
why there should be a strong sense of 'openness' and 'alternate orderings'.


5.2.2  Item  Response  with  Multiple  Dimension  Factoring 

Assuming that one has a single factor description, one would want to refine this
description, particularly as first order models seem independent of ordinal factors and
commutativity had limited resolution. In the example of IQ testing, there can be a multitude of
factors influencing both the direct measurement (question choices, testing environment, item
types) and the latent factors (individual motivation, personal differences, familiarity with
testing). So a natural progression is to parse refining factors out; increasing the cardinality
of the set of observables could then hopefully lead to an approach that analytically refines
better theoretic decompositions.

The underlying linear statistical analyses should be familiar to most statistics practitioners.
However, as the algebraic position poses potential problems, any statistical analysis must come
with simplifying assumptions or accept a degree of incompleteness. Latent factor analysts use
several modalities to categorize these split representations. Useful for this section are the
concepts of measured, manifest, and latent variables, as these are the central variable types
used during refinement.

For the IQ example, measured variables would correspond to environmental factors as well as
the direct test scoring. The manifest variables are the item responses themselves by
the individuals, and the hypothesized latent variables are the description of the underlying
phenomena. Upon observation, one might notice, say, that in individual test scores there are
changes in correlation between groups of individuals on every fifth item row, a
'manifested phenomenon'. If one can then reasonably match these 'manifested responses' to, say,
a pre-scripted 'mathematical reasoning' item type in the measurement variables, then one
can support an empirical stance that this represents some more specific latent phenomenon
being manifested: the expression of a specific trait around a question type that then maps into
the latent variable space. This also works conversely; say the manifested response occurs because
of question priming, hence a 'measurement phenomenon'. Either way this adds a useful algebraic
simplification, as one can assume a 'bijective section' between observations and
theoretic statement (observation -> manifest -> sub-group of latent). One can then, with
assumptions, infer that this corresponds to some latent sub-factoring of the more general factor,
or to a sub-factor giving the correlational space more power. For those familiar with specific
intelligences, these manifested correlational profiles are what was found as support for specific
intelligence groupings.

As one may start to reason, finding a sufficient empirical solution for assigning manifested
dimensions against supporting generalized latent factors is difficult, to say the least. In the
one-factor case, modeling at the descriptive correlative level left one with an ordering inference
but limited refinement potential. While adding measured variables can be shown to add
statistical significance in identifying possible manifested effects, tracing these back to the
latent space, which now has added dimensions, is difficult. An ongoing question is then how
much one should refine manifest dimensions as opposed to generalized latent factors, weighing
descriptive potential versus convergence.

Additional useful classes help in these cases, among which is the inclusion of endogenous and
exogenous variables to describe the epistemic split between manifest and measure, as
endogenous and exogenous variables are assumed to capture manifest and generalized measure
variables respectively. Jorgenson gives a good generalized linear structural model form
that shows the typical variable typing considered for analysis (Jorgenson 1978):

[Figure: generalized linear structural model form (Jorgenson 1978). Legend: Latent Factor (Manifested); Latent Covariate (General); Measured Variable; Covariate (Measured Factor); Functional Link]

The use of a priori typing is that it reduces the presumed analysis. If, for example, one has an
exogenous variable, one can assume that it is not determined by the latent phenomena
within an individual. It still needs to be modeled, but it can serve as a basis for either a direct
quotient on the measurement set or a functional description of the specific covariation factor.
This gives more granularity on the measurement space, providing more precision when
performing the latent factor analysis. The determination of variable typing and configuration
comes from experimental and epistemological considerations (e.g. psychometrics presumes a
different use of exogenous variables compared to econometrics). Pre-typing the
analysis can yield more precision within the latent space, or rather more elimination of
potential moments, but this involves some architecture in the design. The independence of
types against orderings has to be designed and given its own alternate ordering.

With sufficient analysis and computation, various measurement observables can be used to
converge to a solution, but this depends on what form the latent effect takes: is it a specific
factor or is it a correlative term between specific factors? No universal criteria or
procedures for identification were found. However, there are common estimation tools that
should be familiar. Listed below is a sampling, with typing provided by Bartholomew et al (2002):

Let $v = (\tau, \Lambda, \Phi, \Theta)$ be the vector containing all model parameters

Maximum Likelihood

$$F_{ML} = \ln|\Sigma(v)| + \mathrm{tr}\left(S\,\Sigma^{-1}(v)\right) - \ln|S| - p$$

Unweighted Least Squares

$$F_{ULS} = \frac{1}{2}\,\mathrm{tr}\left[\left(S - \Sigma(v)\right)^2\right]$$

Generalized Least Squares

$$F_{GLS} = \frac{1}{2}\,\mathrm{tr}\left[\left(I - S^{-1}\Sigma(v)\right)^2\right]$$

The basic estimation methodology is congruent with most exploratory analysis: minimize the
difference between the fitted model and the sample. The only addition with LVT is the inclusion
of the covariate matrix and the choice of 'distance measure'. It is noted among review papers
that normal, probabilistic items are commonly used, in which case maximum likelihood is usually
expected (Joreskog & Moustaki 2001). Additional methods such as rotations and clustering
are also used, but given that the latent space is of importance and data manipulation directly
influences it, such manipulation should either be kept to a minimum or strongly justified.
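
The three discrepancy functions can be sketched numerically as follows. This is an illustrative sketch, not the implementation used in the cited works. It assumes the standard reading of the symbols: S is the sample covariance matrix, Sigma is the covariance matrix implied by the model parameters, and p is the number of observed variables. The helper names, the one-factor structure, and all numerical values are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def implied_cov(loadings, unique_var, factor_var=1.0):
    """Covariance implied by a one-factor model with uncorrelated unique factors."""
    lam = np.asarray(loadings, dtype=float).reshape(-1, 1)
    return factor_var * (lam @ lam.T) + np.diag(np.asarray(unique_var, dtype=float))


def f_ml(S, Sigma):
    """Maximum likelihood discrepancy: ln|Sigma| + tr(S Sigma^-1) - ln|S| - p."""
    p = S.shape[0]
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_s = np.linalg.slogdet(S)
    return logdet_sigma + np.trace(S @ np.linalg.inv(Sigma)) - logdet_s - p


def f_uls(S, Sigma):
    """Unweighted least squares discrepancy: 0.5 * tr[(S - Sigma)^2]."""
    residual = S - Sigma
    return 0.5 * np.trace(residual @ residual)


def f_gls(S, Sigma):
    """Generalized least squares discrepancy: 0.5 * tr[(I - S^-1 Sigma)^2]."""
    residual = np.eye(S.shape[0]) - np.linalg.inv(S) @ Sigma
    return 0.5 * np.trace(residual @ residual)


# Hypothetical "sample" covariance generated by known one-factor parameters.
S = implied_cov([0.9, 0.8, 0.7], [0.19, 0.36, 0.51])
# A deliberately wrong candidate parameter set for comparison.
SIGMA_BAD = implied_cov([0.5, 0.5, 0.5], [0.75, 0.75, 0.75])

for name, fit in (("ML", f_ml), ("ULS", f_uls), ("GLS", f_gls)):
    print(name, "at generating parameters:", fit(S, S), "| at wrong parameters:", fit(S, SIGMA_BAD))
```

Each function is zero (up to rounding) when the implied matrix reproduces S and positive otherwise, so estimation amounts to searching the parameter vector for the minimum of the chosen 'distance measure'.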

Another useful notion is that of a model which is known to have a unique solution (uniquely
matched with the covariate matrix), in which case it is called 'identified'. Otherwise an
exploratory model set-up is said to be 'under-identified'. The heuristic methodology is to
examine what is usually an under-identified model and add constraints where justifiable until
the model can be identified. As this can make for arbitrary model encapsulation, this general
approach is termed Exploratory Factor Analysis (EFA); sometimes principal component analysis is
used synonymously, as one is 'exploring for the principal factors,' but we will define the
difference in later sections. Useful generalized algebraic treatments of the unknowns concern
potential observable latent objects (Bollen 1989) and varieties of model parameters and degrees
of freedom (Bartholomew et al 2002).
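
A rough counting check for identification can be sketched as below. It is a sketch under stated assumptions rather than a procedure taken from the cited works: it compares the number of distinct entries in the covariate matrix with the number of free parameters of an exploratory model with orthogonal, unit-variance factors. The function names are hypothetical, and a non-negative count is a necessary condition only, not a guarantee that the model is identified.

```python
def efa_degrees_of_freedom(n_items, n_factors):
    """Distinct covariance entries minus free parameters for an exploratory factor model."""
    distinct_moments = n_items * (n_items + 1) // 2
    free_parameters = (
        n_items * n_factors                  # factor loadings
        + n_items                            # unique variances
        - n_factors * (n_factors - 1) // 2   # constraints that fix the rotation
    )
    return distinct_moments - free_parameters


def counting_status(n_items, n_factors):
    """Classify a model by the counting rule alone."""
    df = efa_degrees_of_freedom(n_items, n_factors)
    if df < 0:
        return "under-identified"
    if df == 0:
        return "just-identified by counting"
    return "over-identified by counting"


for items, factors in ((2, 1), (3, 1), (4, 1), (6, 2)):
    print(items, "items,", factors, "factor(s):", counting_status(items, factors))
```

Adding justified constraints lowers the free parameter count, which is the counting view of moving an under-identified exploratory set-up toward an identified one.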

From these criteria, one can take away good guidance for simple informational cases gained
from psycho-social measurement. As discussed, there is a dualistic meta-constraint in the
treatment of measurable variables against the change in the observable, identifiable latent
space. This is just for identifying the analytic model, let alone its commutativity against the
greater state space as identified in a categorical model sense.

There is thus an identifiable typing of model viability that depends on the latent analysis
alone. The compression involved in estimating latent variables involves both the inclusion of
[measurable-manifest] observables and added constraints on the [manifested-latent]
factors. While ideally an exploratory model fits solvability criteria and thus yields a bijective
mapping, LVT generally assumes this is not the case, and easily identifiable behavior models seem
to be the exception rather than the rule. There are other categorical methods for splitting this;
however, for multi-dimensional real space these rules seem inescapable, to say nothing of the
fact that human behavior is a complex factoring. Provided in Figure 6 is a functor diagram
describing the measure limits described in the Rosen measure categories (Rosen 1978).

[Figure 6 diagram labels: System; item coset; observable; LVT Vector; LVT Functor; N-th order vector; reciprocal; Latent Variable Model; Model Set; Homomorphic Functor; Epistemic Correspondence; Epistemic Compression]

Figure 6: Latent Variable Category Diagram


5.2.3  Structural  Modeling  and  Confirming  Factors 

As noted previously, there are observable structures that can be identified, as this is a
relatable social space. For instance, the localization of 'mathematical reasoning' is something
one would want to test as a hypothesis. Because EFA has to assume an open setting for latent
structure, one can instead assign items to a 'known, latent structure' (i.e. pre-scripted
mathematical questions should not share factors with, say, linguistic questions). This provides
an additional constraint, and each added constraint effectively improves the solvability of the
estimation. This is then termed Confirmatory Factor Analysis (CFA), as one is trying to
'confirm' a particular model structure.

This is an inherent simplification, which might seem obvious to most. But as discussed, the
lack of a bijection between humans and their measures means that constraints can be large
assumptions that potentially miss latent phenomena, which, at least as a 'warning signal',
works against the purpose. LVT distinguishes between EFA and CFA because, while the two are
often blurred in practice, they still signal categorically different theoretic approaches. More
common is to use them in tandem: EFA to identify negligible elements, CFA to show continued
good fit, EFA again to expand to new factors, CFA to retest these hypotheses, etc. This is much
like exploratory and explanatory modeling, yet as discussed, EFA and CFA can carry an
additional layer of theory given the ordinal space induced by social theory.

For CFA, the starting point is a hypothesized, set model schema rather than a measurable set.
The goal then is to 'confirm' that this hypothesized model schema is inductively correct using
analysis of 'fit statistics'. Moving to a different basis for analysis, it is then useful to
categorize the schematic models, as often one is trying to (in)validate a multitude of models.
Here visual aids and graphical notation are often used to show the full model structure. As an
interesting side note, the initial schema is often elicited from community members and is then
aligned to model elicitation practices, so this may be an interesting overlap with systems
modeling. As (Hersberger 1994) notes, the use of graphical notation can vary. However, the
notations tend to communicate equivalent structure within LVT.

Once there is a set model structure, the model gains the description of being a structural
equation model (SEM). This presents a paradigm to simulate the set model, and the goal is then
to obtain a variety of fit statistics against a data set, or more ideally a class of data sets.
As might be intuitively realized, one inherently assumes that the model is identifiable (or
semi-identifiable under probabilistic models). CFAs are then subject to necessary (but not
necessarily sufficient) conditions from Bollen (1989):

$$ t \le \tfrac{1}{2}\, p(p + 1) $$

where t is the number of estimated parameters and p is the number of indicators.

•  The scale of each latent variable must be set by either
o  Fixing a factor variance (Phi)
o  Fixing a factor loading (Lambda)
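
The counting condition above lends itself to a quick mechanical check. The following is a
minimal sketch, not a full identification test. It assumes the usual non-strict form of the
rule (t no greater than p(p + 1)/2), a simple-structure CFA in which each indicator loads on
exactly one factor, uncorrelated errors, free factor covariances, and factor variances fixed to
one to set the scale. The function names are hypothetical and chosen for illustration only.

```python
def t_rule_satisfied(num_free_parameters: int, num_indicators: int) -> bool:
    """Necessary (not sufficient) counting condition: t <= p(p + 1) / 2."""
    unique_moments = num_indicators * (num_indicators + 1) // 2
    return num_free_parameters <= unique_moments


def count_cfa_parameters(num_indicators: int, num_factors: int) -> int:
    """Free parameters of a simple-structure CFA (assumed set-up, see lead-in).

    Each indicator contributes one loading and one error variance; factor
    variances are fixed to 1, so only the factor covariances remain free.
    """
    loadings = num_indicators
    error_variances = num_indicators
    factor_covariances = num_factors * (num_factors - 1) // 2
    return loadings + error_variances + factor_covariances


# Hypothetical example: six indicators measuring two correlated factors.
t = count_cfa_parameters(num_indicators=6, num_factors=2)
passes = t_rule_satisfied(t, num_indicators=6)
```

For six indicators on two factors this gives 13 free parameters against 21 unique variances and
covariances, so the condition holds; two indicators on one factor give 4 parameters against 3
moments, so it fails. Passing the check does not establish identification, since the condition
is only necessary.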


One then makes an inductive case, using a variety of fit statistics, to claim that a
hypothetical SEM is valid. The validity and relative strength of these statistics seems to be
its own area of inquiry, and there did not seem to be an apparent universal criterion. However,
there did seem to be two major classes, based either on derivation from a chi-square statistic
or on an information theoretic basis (i.e. information criteria).
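
As a concrete illustration of the information-criteria class, the sketch below computes the two
most common criteria from a fitted model's maximized log-likelihood, along with the degrees of
freedom used by a chi-square fit test. The formulas are the standard definitions of the Akaike
and Bayesian information criteria; the function names and the example numbers are hypothetical
and are not taken from any model in this review.

```python
import math


def aic(log_likelihood: float, num_parameters: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * num_parameters - 2 * log_likelihood


def bic(log_likelihood: float, num_parameters: int, sample_size: int) -> float:
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return num_parameters * math.log(sample_size) - 2 * log_likelihood


def model_degrees_of_freedom(num_indicators: int, num_parameters: int) -> int:
    """Degrees of freedom of the chi-square fit test: p(p + 1)/2 - t."""
    return num_indicators * (num_indicators + 1) // 2 - num_parameters


# Hypothetical comparison of two candidate models fitted to the same 200 cases.
simple = {"log_likelihood": -1510.0, "num_parameters": 13}
complex_ = {"log_likelihood": -1505.0, "num_parameters": 19}
preferred_by_bic = min(
    ("simple", bic(sample_size=200, **simple)),
    ("complex", bic(sample_size=200, **complex_)),
    key=lambda pair: pair[1],
)[0]
```

In this made-up comparison the extra six parameters of the larger model do not buy enough
likelihood to offset the penalty, so the simpler model is preferred.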

These fit statistics have been implemented in various platforms:

•  LISREL 

•  Amos 

•  EQS 

•  MPlus 

•  GLLAMM 

•  Stata 

•  R  packages:  Lavaan,  Psychometrics 

Here there is a useful descriptive difference between PCA and CFA. In measurement
environments, the full space of potential model structures and/or schemas can be immensely
complex relative to a numerical space. However, some environments are clearly more constrained
than others. CFA is generally useful here because a high-level analysis can help one map to
assumptions based on context: a pilot in flight offers a much stronger linear basis than
personality does over food choices. More ordinally constrained environments are then more
easily fit to identifiable models, and one rather 'monitors' that the axiomatic principles of a
model are not being violated; one monitors the 'principal components' of a system against
those of an identifiable model. Otherwise one is said to be 'confirming factors' in an
otherwise unidentifiable system. Just as one switches between EFA and CFA, though, PCA is often
analogous or synonymous in practice.

The PCA paradigm seems closest to direct usage in monitoring for 'higher order' effects. A
recent example is the creation of information theoretic models for stock market behavior. The
first-order model matches the units to the Brownian motion object from the measure basis
(Bollerslev & Mikkelsen 1996). There are then general theorems that present useful classes
within the system, explaining the possible information states that are statistically
identifiable: 'long-memory', 'rough stochastic', and 'non-memoried' (Fouque et al 2000;
Chronopoulou & Viens 2012). What financial engineers then attempt is to create 'warning
signals' that monitor whether the principal assumptions within a class of model are being
upheld or can reasonably be confirmed. 'Long-memory' implies there is reliable hysteresis, so
monitoring just confirms that particular components are fulfilled and thus allows for
identifiable models (congruent to PCA). 'Rough stochastic' allows for capturable algebraic
assumptions, as there is a 'mean reverting' portion, but there is not enough for full
identifiability, so one confirms that a particular model is viable (congruent to CFA).
'Non-memoried' (i.e. non-dependent) allows for few tractable solutions, so general exploration
is the only option (congruent to EFA) (Chronopoulou 2016). For modeling and simulation
purposes, one then assumes that a class change maps to a representable state change in the
system, and one has a reasonable, if abstract, control schema. Although the algebraic rules
that come with stochastic systems make the identification of models different from traditional
LVT, there is consistency in the analytic intuition, and it is our hypothesis that this is due
to the (latent) ordinal structuring.

While identifiable factor models are ideal, one generally has to explore a variety of models as
well as ordinals. There are common model assumptions that are useful and considered well
justified, if not well identified (Moustaki et al 2015):

•  Setting latent factor loadings to fixed values

•  Setting or constraining error variances

•  Interchanging error variance with correlated specific factors

•  Specifying covariance between factors

•  Scaling the latent variable


So the uniqueness of LVT methodology lies in the 'two-sidedness' of the analysis, as one
explores both modal structure and algebraic schema. Considering again the IQ item measurement,
as one repeats the measure one would like to use it as a predictive simulation. However, given
that there may be two (semi-)viable model schemas (those mathematically inclined vs those
physically inclined), one is running dual simulations based on the mapping onto these classes
of individuals, and for large sets the model, being a strong compression, finds specific
factors within them. Or, trivially, how does one resolve the Liar's Paradox, as someone may
have a vendetta for any reason and may consciously mis-answer test questions?

These 'problems' abound and tend to plague most simulations of social theory, but for a
particular identified model structure one can justifiably take formal stances (i.e. claim a
philosophical stance) that allow for viable modeling efforts. Borsboom et al (2003) provide an
extensive review that covers the large categories of perspectives and also references
perspectives per model type. This quickly relates to large discussions on causality, ontology,
epistemology, etc., so we pause only briefly to mention that the formal stance is a
philosophical one, not one gained fully from analysis. This is important because, during the
review, we noticed the care the literature takes in relating formal stance(s) to the measured
data, dimensional significance, covariate choice, modal classes / variable typing, and
modalities.

For IQ modeling, one explores the scores to move from interactive factors to principal factors,
and then explores the score space itself to confirm those factors against a reasonable variety
of stances or previous modalities. The difficulty lies in the dependence between the phenomena
one is trying to separate within a covariate space: is this factor in the deviation directly
tied to language difficulty within the group, to a specific question type, or is it generally a
measured answer of a general factor? These become large standard moments that are difficult to
place in terms of epistemology: is clustering behavior reducible to a single source, and is
this source attributable to a manifest or a latent variable? This requires iteration on the
model to investigate its nature, and iteration of measurement to inductively claim that it
holds across environments and individuals, lest it fail to be 'general' intelligence. There is
some reasonable art in taking formal stances, but clearly the variety contained in most
psycho-social systems requires a breadth of analysis and algebraic considerations.

As PCA and CFA maintain the same algebraic description, the two analyses overlap consistently.
From an axiomatic model standpoint, no difference is maintained between the numerical models,
as the model algebra is transposable. Mathematical psychology, according to Bartholomew et al
(2002), does not contest this, as "in fact the two can be indistinguishable". The methods are
instead distinguished by 'meta-purposes' such as "intent", "hypothesis", and "experimental
considerations". Indeed, within several reviews the analysis program types are described as
often "done in tandem" or "sometimes indistinguishable". Considering the IQ model, analyses of
answer clustering that yield a distinguishable latent fit to an identified model (EFA) often
make good candidates for first-order factor models (CFA), often using nearly the same
computational analysis.

The difference then comes under extension. When extending across groups or alternate ordinals,
as touched on, one has to consider certain breaks in commutativity. PCA and CFA distinctions
then become important, as, say, a PCA on individual (sub-group) scores may not (necessarily) be
congruent with a CFA model on the total group or on the total phenomena of interest in that
group. Again for IQ, it has been found that certain PCAs that yield specific performance
groupings are not nicely congruent with a CFA program using a generalized intelligence measure
model (Florn & McArdle 2012). So any set factoring on an individual does not (necessarily) seem
to be a reliable sub-group of the total (algebraic) grouping. This certainly opens up the
perplexing areas of complexity present in the social sciences as elsewhere. The uniqueness
here, however, is in the higher order inferences, which demand strong consideration of method,
program, and an 'algebraic intent'.

In reviewing practices in latent variable theory, it is not surprising that effective
measurement programs exhibit methodological typing and consideration of experimental context.
As Jorgensen notes, good exemplars often involve iteration and cross involvement of both the
analysis method and other type considerations (Jorgensen Factor Analysis). Upon review there
were several other sub-typings available; these should not be considered exhaustive but rather
exemplar ideation found within the LVT review in the next section. As there is not necessarily
nice compression within measured groups across the whole, a common typing found was to consider
unique mereological rules (i.e. group inheritance rules): considerations found were idiographic
(individual ideation) against nomothetic (general ideation), class structuring, and factoring
conscious and unconscious processes differently (Nesselroade 2012). Sub-orderings will be
familiar to modelers (categorical, endogenous, exogenous), but these are still considered to
embed into the same EFA, PCA, and CFA paradigms and so inherit the potential 'construct
validity' issues as discussed. The listing is meant to show a sampling of potential extensions
of types that are under consideration, and as there are no clearly reducible logics, due to the
seeming grouping problem, modelers have to maintain these 'meta-method types' within their
activities.

Considering the algebraic effect on these descriptions begins to raise unusual questions. With
the functor diagram from our previous sections, we are left with the epistemic question of
finding case refinement for particular models and systems of interest. The experimental
considerations then apply to a typing on the system of interest: for example, considering an
idiographic nature predicates available objects different from the specific objects of a
nomothetic system. This then 'compresses' the available space to a viable state space via the
response factoring to an 'item'. LVT has shown that an 'item' can be abstractly extended to
other behaviors given an algebraic unit for said behavior: "health behavior" -> 'visit to
doctor'. With enough identification or experimental set-up, these 'nomologies' are thought to
form a space which hopefully has enough 'identification' to form a measurable space, within
which the objects correspond to what are ultimately 'ideas' living in the 'imaginary numerals'
of a complex space. In the former, one tends to assess a model using 'construct validity'
(whether the overall construction of the phenomena is valid), and in the latter one tends to
use 'epistemic validity' in a similar sense to that which most measure theorists use. This is
contingent on the seeming unidentifiability of the general space that theory is to describe, so
it motivates the use of more abstracted types to dually describe the validity of enumerated
analysis and the mapping to the spaces and theories within which it is to be contained. This
hopefully describes why the classing and typing of models are often more useful to generalized
analysis than combinatorial methods on enumerated models.

The above discussion seems to imply a complex numeral system, or even something beyond it, as
the basis within the system. While stochastic and other methods were found that touched on
complex methods, material on the relation to complex analytic theorems or larger order topology
was not found as of the time of this review. However, there were interesting investigations in
this area, including complexity (Byrne & Callaghan 2014) and even categories (Phillips & Wilson
2010). Still, this work has not yet reached the point where these considerations can be
integrated into LVT in practice.

In trying to formulate these ideas, the larger generalized information on 'constructs' still
has some relations that can be observed. Even if IQ cannot be readily mapped to an enumerated
explanation, across several studies one can still have a compressed description that there is
some objective (if fuzzy) relation between 'cognitive ability' and 'itemized behavioral
performance' (or likewise generalizing terms). So one can observe 'fuzzy' or 'softer' theoretic
descriptions that are only computable under 'identifying constraints', while the general
relationship can still be accessed. Social theory cannot be too simple, or else computable
architectures would be more readily available.

This leads to an interesting question about the best representational means of 'communicating'
these general 'nomologies' while maintaining computability, or at least 'modelability', where
generally observed. Cronbach, Shadish, and Trochim have interesting discussions in which all
suggest variations on a 'nomological network' with which to describe these observable
relationships. In all of them there is consideration of the generalized object groupings and
their modal extensions, which 'tie together' (e.g. 'intelligence' -> 'creation of specific
skill' -> 'performance grouping' -> 'item performance'). The thinking is that these 'networks'
notate more general hypotheses that can make use of inductive evidence to support themselves,
but that dually have competing general theories with which to form more specific (and thus
identifiable) hypotheses. These are then abstract ordinal theories that are maintained
congruently across set hypotheses. This area is explored more in other sections; it is raised
here to show the intuition behind the 'abstract encoding' that social theory entails.

Again explored in more depth in other sections, this is equivalent to a category theoretic
description. Briefly, a category, defined mathematically, is a collection of abstract objects
equipped with 'morphisms' between them (here a model schema, an algorithm, or more abstractly a
'functor'), and there is a growing body of mathematical theorems which may help provide means
for specifying these potential 'nomologies' into situatable 'ontologies' by progressing the
generalized space to a type set ordering. Explored below is an attempt to represent these
generalizing ideas with formal functors. These have a nice morphism between them, but at least
upon review this only seems to happen given, again, particulars. PCA against CFA can gain a
further refined description by considering the diagram again for the model apportionment, from
the point of view of a particular system split by observables and then their covariant space
translation.

Figure 7: PCA as Colimit Adjoint

At the beginning, the starting assumption is that these are not reliably commutative on either
the covariate or the observable sub-closures. Then, as one searches for reliable objective
'descriptors' (i.e. functors), one has to consider that the left and right (co)settings are not
reliably symmetric. So one has to consider exploring into the system space from the item
response (i.e. the intuition behind EFA), and one has to consider mapping the system reasonably
to said set item (i.e. the intuition behind CFA). In cases where the system is 'identified',
this seems equivalent to saying that the left and right settings are commutative and the
distinction thus negligible (hence PCA being a simplifying case).

So one has to analyze 'each side' for both exploratory (defined functions, varying object
descriptions) and explanatory (defined objects, varying descriptions) modeling, but then has to
find a matching colimit, as a description must be an inverting match back to the observable. In
many measurement areas this distinction can be trivial, but for a social system that can
literally, if slowly, inject abstractions, this triviality is not so trivial. Also, as social
modeling measures must explore ordinals and other structural typing, this adds the requirement
to match these functions to the same invertible functor(s) in higher order spaces. If it can be
set epistemically or identified, then the observable itself becomes nicely epistemically
recursible.

This can then vary the theoretic algebraic grouping (i.e. object, function match) across higher
orders. Both objects and functions are being searched across a bi-directional branching. As
such, EFA can be thought of as the analytic exploration, particularly on the 'left side' of the
diagrammatic program; CFA as the usually algebraic exploration, particularly tracing from the
'right side' of the program; and PCA as a designation for exploration upon sufficiently
reusable theory. As well, the variability in these object-operation pairings presents questions
beyond the 'construct' validity concern of whether or not a particular analytic pairing can be
said to fit an available construct within the system of interest. One might find a quantum
theoretic object description, but are the associated quantum operations then available in the
system? Even if so, is this quantum congruence at the measured variant or the latent?


Definition: $F$, "LVT Construct",

$$ \text{s.t.}\quad d(S_i) \Rightarrow F', \qquad S_i / f \Rightarrow F, \qquad \wedge\ \exists f^{*} $$

where $F'$ - 'topology over F'

$d(S_i)$ - 'distance measure'

$f^{*}$ - 'recursible function'


Figure 8: Construct Definition

Posed then is the extent to which such considerations become available. Even though splitting
these groupings numerically becomes difficult, representing them seems a natural human ability.
So one can clearly identify these as a potential encapsulated analysis and thus as coded
informata. While there is general treatment as noted, and the base formulation is extended to
other modeling paradigms, attempts to find a formal information theoretic treatment turned up
little. For instance, one may think to use an ontology mapping for tracing typing on models,
and instead find a methodology in which numerical methods become low cost and human intuition
can produce more readily changeable generalized nomological hypotheses (invoking patterns
similar to 'human-on-the-loop'). As various diagrammatic formulations with varying morphic
properties become possible, these would be couched in a natural language yet would need a
settable representation for computable results.


5.2.4  Multi-Leveling,  Data  Typing,  and  Mixture  Models 

In a formal manner, PCA modeling is based on exploring linear (matrix) algebra for the latent
covariate profiles. A predictive matrix algebraic equation is then explored for fit statistics.
Variation and judgement enter in fitting different underlying model constructions, with model
complexity judged against those fit statistics. So for IQ testing, choices may include
assigning scores as manifest variables or measured variables, or using other sub-typings (e.g.
continuous profile vs categorical), with which grouped scores are again examined using
exploratory statistics along with exploring potential principal component assignment. The
result is to establish from data both a variable analysis and an analysis of abstraction
structure. For identified model structures with sufficient justification, one can embed models
and make use of multivariate and multimethod simulations. This is a developing area, moving
towards mixture modeling and classifying identifiability and extensibility in practice.
Reviewed here is the methodology for embedding. We also present models that may be of interest
to social system modeling and efforts in the systems community.
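
To make the matrix-algebra basis concrete, the sketch below extracts the first principal
component of a small score table from its sample covariance matrix. It is only an illustration
under stated assumptions: it uses plain power iteration rather than a full eigendecomposition,
it assumes the leading eigenvalue is strictly largest and that the starting vector is not
orthogonal to the leading direction, and the function names and toy scores are hypothetical.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of a list of equally long score rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]
    return [
        [sum(c[j] * c[k] for c in centered) / (n - 1) for k in range(p)]
        for j in range(p)
    ]


def matvec(matrix, vector):
    """Matrix-vector product for nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def first_principal_component(cov, iterations=200):
    """Leading eigenvalue and unit eigenvector of cov by power iteration."""
    p = len(cov)
    vector = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        product = matvec(cov, vector)
        norm = math.sqrt(sum(v * v for v in product))
        if norm == 0.0:
            raise ValueError("covariance matrix annihilates the start vector")
        vector = [v / norm for v in product]
    eigenvalue = sum(v * w for v, w in zip(vector, matvec(cov, vector)))
    return eigenvalue, vector


# Toy data: two item scores per respondent, the second exactly twice the first.
scores = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
variance_explained, loading_direction = first_principal_component(
    covariance_matrix(scores)
)
```

With these perfectly collinear toy scores, a single component carries all of the variance and
points along the direction (1, 2). Real score tables would spread variance across several
components, and the judgement described above concerns how many of them to retain.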

Once one has established a covariate notation, the base format can be extended across various
variable types. One can represent measured ordinal variables by establishing a mapping from a
latent continuous variable, implying a fuzzy inference mapping to sectioned items:

$$ x_i = a \iff \tau^{(i)}_{a-1} < x_i^{*} < \tau^{(i)}_{a} $$

$$ |\tau^{(i)}| = m - 1 $$

$\tau^{(i)}_{1}, \dots, \tau^{(i)}_{m-1}$ = totally ordered thresholds,

$|x_i| = m$ = "number of variable categories"

One then maps this to a latent model format by fixing the underlying variable to mean zero and
variance one (as the measured variable is categorical and so carries no scale of its own), and
the factor loadings then map to threshold values. This then has identifiable solutions
depending on multi-stage estimation (Muthen 1983; Joreskog 1990, 1994). With the available
latent objects, there are strong examples of translating categorical behavior to continuous
space for analysis (Bartholomew et al 2002). There are also interesting behavioral examples, as
one can obtain predictive models that map a restricted domain (time delimited health check-ins)
onto a continuous probability measure set.
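
The threshold mapping from a latent continuous variable to ordered categories can be sketched
in a few lines. This is an illustration only: it assumes m - 1 strictly increasing thresholds,
with the conventional end points of minus and plus infinity implied, and it assigns a latent
value that falls exactly on a threshold to the higher category (an event of probability zero
for a continuous latent variable). The function name and the example thresholds are
hypothetical.

```python
from bisect import bisect_right


def to_ordinal_category(latent_value, thresholds):
    """Map a latent continuous value to a category 1..m given m - 1 thresholds."""
    if any(low >= high for low, high in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")
    # bisect_right counts the thresholds at or below the value (0..m-1).
    return bisect_right(thresholds, latent_value) + 1


thresholds = [-1.0, 0.0, 1.0]           # m - 1 = 3 thresholds, so m = 4 categories
latent_values = [-2.0, -0.5, 0.5, 2.0]  # hypothetical latent responses
categories = [to_ordinal_category(v, thresholds) for v in latent_values]
```

Estimation runs in the opposite direction: the categories are observed, and the thresholds and
the latent structure are what the multi-stage procedures recover.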

Since there are successive ordinals on the model, there are several choices for the model(s),
although the covariate algebraic principles remain the same. It is then useful to establish the
assumptions of the model, showing object type and variate connection to the analytic models;
this is where the graphical format becomes useful to capture the assumed structure between
measured variables. This allows a scripted procedural methodology by equation and profile
structure: graphical combinatorial type -> model set -> method program. One can then presume
(and many have) identified categories for particular structures.

For example, a 'path analysis' has a representable diagram such as the one below:

Figure 9 - Path Analysis (Adapted from Moustaki et al 2015)

Here one can estimate latent effects, measured a priori, that might determine latently
determined posterior measurements. A clear example is a personality response that determines a
behavioral profile, which then shows success 'factors' that determine program outcome. This is
also a common form within public health behavior research, which tries to find prior individual
factors that, across time, determine measurable biological responses (Wall & Li 2009).
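
The chained structure of a path analysis can be illustrated with a deliberately simplified
sketch. It assumes observed stand-ins for what would be latent quantities in a full model, a
pure chain with no direct path from the first variable to the last, and ordinary least squares
for each link; the indirect effect is then the product of the two path coefficients. Variable
names and data are hypothetical.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical chain: personality score -> behavioral profile -> program outcome.
personality = [0.0, 1.0, 2.0, 3.0]
behavior = [2.0 * x + 1.0 for x in personality]  # first path, coefficient 2
outcome = [3.0 * b - 2.0 for b in behavior]      # second path, coefficient 3

path_a = ols_slope(personality, behavior)
path_b = ols_slope(behavior, outcome)
indirect_effect = path_a * path_b
```

Because the toy data contain no noise and no direct path, the indirect effect equals the total
slope of outcome on personality. With real data the two would differ, and a latent variable
treatment would estimate all paths jointly rather than link by link.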

Generalizing the various considerations after the model set-up, an obvious requirement is a
convergence algorithm to translate the open space to a measurable space. A maximum likelihood
estimate is used, with Bayesian information considerations applied per individual model
involved. Then, using multivariate covariate structures, one can mix latent variables under an
assumed model class, e.g. linear models, growth models, etc. Mapping these model class objects
to our latent structural objects, one can provide latent dimensionality to an analysis: one
might then, presumably, search for possible ordinal responses within traditional technical
analysis. This presents LVT as a means for identifying a 'space' in which a behavior might
arise (Wang & Wall 2003). As one hopefully notices, this mimics the structure of its analytic
cousin, but with LVT the purpose is to separate the 'social variant' as explicitly as possible
into the 'latent space'; by extension, manifested variables can then be modeled in familiar
technical ways using time series variables from ordered measured variants.

Multiple indicators and multiple causes (MIMIC) is a standard form for regression analysis
using a single factored endogenous latent variable against multiple possible exogenous
covariates. An example is PTSD determination (a disease potentially linked to several
environmental factors) parsed amongst a group of individuals. The form is useful for
determining individual latent behavior against several potentially compounding inputs. A
modelable example is given by (Gerwirtz et al 2010), looking at the multi-method, multi-group
effects surrounding posttraumatic stress symptoms and behavior. This then leads to a future
policy analysis on the latent space of potential factors that helped to explain some
unanticipated effects of treating PTSD across different policy groups.
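
In equation form, a MIMIC model with a single latent variable is commonly written as below. The
symbols are assumptions introduced here for illustration and do not appear in the sources
reviewed: eta is the latent variable, x_1, ..., x_q are the exogenous covariates (the 'causes'),
y_1, ..., y_p are the indicators, and zeta and epsilon_j are the disturbance and measurement
error terms.

```latex
\begin{aligned}
\eta &= \gamma_1 x_1 + \gamma_2 x_2 + \dots + \gamma_q x_q + \zeta
  && \text{(multiple causes)} \\
y_j &= \lambda_j \eta + \varepsilon_j, \qquad j = 1, \dots, p
  && \text{(multiple indicators)}
\end{aligned}
```

As in the CFA case, the scale of the latent variable has to be set, for example by fixing one
loading to one.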

The last example, particularly related to complex system modeling, concerns multi-level and
dynamical estimation. If one maps a linear latent equation to an individual, as has been shown,
then one gains a latent model mapping onto these measured determinants. One then maps these
determinants as inputs to another latent "level". A concrete example is modeling a
person-specific education model and then tracing the change in score determinants (e.g. gender,
race, baselines) against change in environmental factors (e.g. classroom changes year to year).
The basic intuition is to have a linear regression model, but with the slope (Individual
'Level 1') determined year to year. One then has a latent estimate against the intercept of
these regressions (Classroom 'Level 2') assessed against ordinal encapsulations (i.e. upon yearly
classroom changes). Similarly, growth models are estimated by mapping the latent objects to the
parameters of an identifiable system of dynamical equations. One then estimates latent growth
factors by regressing against the curve profile (Bollen & Curran 2006).
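
The two-level intuition can be illustrated with a small sketch. It is a two-stage approximation
(one ordinary least squares line per individual, followed by a regression of the fitted
intercepts and slopes on a higher-level covariate) and not the simultaneous latent growth curve
estimation described by Bollen & Curran. The function name, argument names, and data layout are
hypothetical.

```python
import numpy as np

def two_stage_growth(scores, times, covariate):
    """Two-stage approximation to a linear two-level growth model.

    scores:    (n_individuals, n_waves) repeated measures, one row per individual
    times:     (n_waves,) measurement occasions (e.g. school years)
    covariate: (n_individuals,) a level-2 predictor (hypothetical, e.g. a classroom factor)
    """
    scores = np.asarray(scores, dtype=float)
    times = np.asarray(times, dtype=float)
    covariate = np.asarray(covariate, dtype=float)

    # Level 1: fit score = intercept + slope * time separately for each individual.
    design1 = np.column_stack([np.ones_like(times), times])
    coefs, *_ = np.linalg.lstsq(design1, scores.T, rcond=None)
    intercepts, slopes = coefs[0], coefs[1]

    # Level 2: regress the individual intercepts and slopes on the covariate.
    # Column 0 of the result describes the intercepts, column 1 the slopes.
    design2 = np.column_stack([np.ones_like(covariate), covariate])
    outcomes = np.column_stack([intercepts, slopes])
    level2, *_ = np.linalg.lstsq(design2, outcomes, rcond=None)
    return intercepts, slopes, level2
```

A full latent growth model would estimate both levels jointly and account for measurement error
in the fitted intercepts and slopes; the sketch only shows how the output of one level becomes
the input of the next.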

The model types become extensive as one matches a latent profile against an identifiable
constraint (epistemic validity) and against the experimental description that allows theory to
reasonably map to those identifiable constraints (construct validity). That said, other
interesting model estimation programs are available or under research. It is useful to note here
that research requires not only identifying the statistical solution, but also inductively
showing that this can reasonably represent a contextual class of behavior.

Below  is  a  non-exhaustive  sampling  of  LVT  paradigms  encountered: 

•  Multiple  and  Multivariate  Regression 

•  Analysis  of  Variance  (ANOVA) 

•  Multi-Group  Analysis  for  Categorical  Data  (Millsap  &  Yun-Tein  2004) 

•  Latent  Recursion  Models 

•  Non-Linear  Growth 

•  Multivariate  Latent  Class  (Collins  &  Wugalter  1992) 

•  Autoregressive  Latent  Trajectory 

•  Indicators  for  Latent  Exogenous  Variables 

•  Latent  Growth  Curve  Mixture  Model

•  Latent  Nested  Ordinal  Models


Last to mention is estimation of latent classes based on these latent models. The output is a
categorical informational variable that attempts to determine the class and type relation. The
object mapping is then to an underlying probability space, which combinatorially implies a
Bayesian network. One then assesses the marginal probability space using information criteria.
For the extended categorical data setup, (Moustaki & Knott 2000) provides a good
generalization, and (Mejlgaard & Stares 2010) shows a basic case example. It is then thought
possible to identify an embedded ordinal model within a class (Joreskog & Moustaki 2001),
which might be useful for estimation using Bayesian nested models.

As presented, there were useful categories given the wealth of modeling encountered. At the
conceptual abstraction, multi-modeling falls back on probability and frequency assessment, and
thus criteria have been translated to system models from an information theory background. The
additional criterion from the dual PCA and CFA factoring is to pay attention to the probabilistic
frequency for both observed and latent variants, termed observed frequencies and expected
frequencies respectively. To represent these 'predicted frequencies', profiles are then assumed
based on the categorical type of the model variable. An interesting finding was the extent
to which the information criteria in use deviated from the 'traditional' Bayesian Information
Criterion (BIC), as this prefers models more parsimonious than is necessarily sufficient to
explain the phenomena of interest. Primary alternatives ranged from the Akaike Information
Criterion (AIC) through bivariate marginal residuals (Maydeu-Olivares & Joe 2008; Reiser 2008;
Bartholomew & Leung 2002), as these are thought to be more indicative of latent social variants.
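
The parsimony point can be made concrete with the textbook definitions of the two criteria. The
sketch below uses invented log-likelihoods and parameter counts purely for illustration; they are
not results from any study cited here.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: the penalty is 2 per parameter."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: the penalty is ln(n_obs) per parameter."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical fits to the same 400 observations (lower criterion values are preferred).
simple_model = {"log_likelihood": -520.0, "n_params": 4}
richer_model = {"log_likelihood": -512.0, "n_params": 9}
n_obs = 400

for name, fit in (("simple", simple_model), ("richer", richer_model)):
    print(name,
          round(aic(fit["log_likelihood"], fit["n_params"]), 1),
          round(bic(fit["log_likelihood"], fit["n_params"], n_obs), 1))
```

With these invented numbers AIC favors the richer model while BIC favors the simpler one, because
the BIC penalty per parameter, ln(400), is roughly three times the AIC penalty of 2.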

These concerns and considerations become lengthy, and there is a large and growing body of
knowledge (Jones & Thissen 2007). Prominent terms were quantitative and qualitative
considerations for epistemic and construct validity respectively. As well, large scale modeling
involves matching with the appropriate '-ology' within the subject area: e.g. personality with
psychology, group behavior with sociology. The primary agenda encountered amongst
practicum groups was dually expanding the quantitative constructs available in psychometrics
and econometrics, the two areas that primarily measure agent and enterprise policies
respectively.


5.2.5  Structural  Determination  &  Operationalizing  LVT 

Here one can mention how latent factoring in social science methods begins to break more
significantly from the physical sciences. Physics often deals with latent factors. For example, a
thermometer is influenced by latent factors such as atomic collisions, but it is sufficient to deal
with the abstraction by statistically aggregating the collisions (the physics latent description of
heat). An even more interesting example is the development of the theory that explains fusion
within stars. Even though physicists cannot isolate a star, examine its interior, or conduct
experiments on it, they were able to develop a theory that explains how it works. This theory is
based on a number of phenomena that are not directly observed. However, we know that this
latent model works within certain predictable bounds (e.g. temperature and pressure ranges for
material state), as one can then observe measurable objective phenomena within some defined
convergence space.

In a similar fashion, latent statistical criteria have bounds defined within the experimental
design by the congruence between construct objects and the implied theorems (e.g. rational
agent theory in econometrics). However, humans have the ability to describe their own
behavior and potentially modify it. This quality makes it increasingly difficult to have an
objective (stationary) constraint on the embedded model algebras. Physics has centrally
established, settable theoretic descriptions (e.g. Newton's Laws, Thermodynamics, and Quantum
Theory) whose internal ordering is not influenced by human action; particles presumably do not
possess a humanly conscious language. Even biological entities require a different abstraction
for their study, and the human social sciences follow along the same extension. Social systems
do have stability that one could model. However, it is understandable that modeling efforts
have to actively check whether the underlying construct is 'actual anymore' as the ordinal space
fluctuates. This is quite complex for the model, but can be simple for a 'human system': e.g.
if I am disagreeable and suddenly learn of the measure, I can actively work against the
measurement program by thinking consciously, critically, or even schizophrenically. This is a
simple enough decision for us to read and understand, but to a model it would look like ordinally
embedded response profiles. So 'simplicity' and ordering likely have different properties here.

Complexity within social theory then has a 'perspective problem' between a human reasoned
model and where this model ordinally appears in 'real, measurable space'. This brings up an
element of post-modernist theory which we wish only to mention here, but of which it may be
relevant to be aware (Susen 2015). The complex part of the 'human measure of interest' is
self-description and conscious language, as these are the same phenomena with which we conduct
scientific study; scientific theory is expressed in a human-readable language in the end. As many
social psychology studies have found, 'priming effects' are prevalent in so-called 'self-fulfilling
prophecies' (Borsboom et al 2003). These are centrally relevant to the abstractions involved:
even with objectively set objects, the underlying abstractions are mutually dependent on
the phenomena of study. Thus independent settable objects are not (necessarily) always
obtainable; or rather they are attainable but then not fully objective. The short answer to this
problem is consciously considering the experimental context and the construct validity issues
(Shadish et al 2001). But the underlying difficulty is the injection that causes the system to
change algebraic structure.

As seen previously within an established social science measure theory, these theoretic
descriptions are not as easily 'insertable', or rather are insertable only under conditions.
Alternate representations seem to present diverging combinatoric and algebraic profiles even
under the same observable quanta. These issues are not absent from physics models, but one can
observe the extent to which, when dealing with human phenomena, abstractions on
objects are more complex from an algebraic perspective. Often there is then a lack of
symmetries at certain abstractions for system models. This raises the question of the extent
to which embedding certain methods is available in the modeling process for
psychological or social phenomena. Particularly as humans exercise conscious choice in their
actions, any informational modeling may be remiss in its assumed algebraic space. Hence it is
understandable why so many are concerned with the generalized 'construct validity' along with
the traditional epistemic validity. The heuristic method is to dually class and type phenomena
and models combinatorially, which provides for growing paradigms for modeling that should
actively trouble model theory within socio-technical spaces.

However, one can then ask what leverageable structure is available for modeling any 'human'
phenomena, at least over iteration. Within psychometrics, there is reliance on the covariate
structure inherent to the response profile, but the choice of model is made by a 'human system'.
So generally a broad program is 'softer analysis' by researchers trained in psychological
analysis (note: not necessarily psychoanalysis as a theory), which yields general behavior classes
and types. This is then crossed with available model schemata or other theory mapped to
'modellable' compressed objects. This all seems centered on the lack of 'centering' in our
language and subsequent responses.

One then gets an impression of how data analysis tends to yield divergent theory in social
science rather than the converging theory of the physical sciences, where one has
multi-dimensional statements collapsible into theory. For example, there is theory confined to
health behavior that makes use of other general theory, such as social-cognitive theory, as well
as latent variable theory. However, aspects then invoke schema theory (compression by observed
conscious schemas) and behaviorist theory, neither of which is thought to be 'contained' in
either. This implies there is not the reasonably strict hierarchical mereology that is common in
the physical sciences: e.g. quantum theory is thought to 'map up to' fluid dynamics upon
sufficient scale even if the particular dynamics are not identified, so that one has a
rough 'parthood' ordering relation at least by invoked phenomena. In social theory, it is
difficult to induce this mereology, as the theoretical status of theory objects is
seemingly constantly debated and there are often several philosophical stances, schools of
thought, and/or contextual sub-theories.

This can then generally be transferred to any response item and then to any objective results
mirroring that structure, as one just creates higher moments around a particular response item.
From this, it is a programmatic method for assessing intended performance for an intervention.
This usually has an implied n-tuple in its 'theoretical algebra', as it usually claims a
particular stance or experimental context. This severely limits the extensibility and/or
composability of social models: while these higher-order theories have support, it is not
well understood how these theories sans models can be reliably composed, as there are several
'n-tuples' that a particular measure model 'comes from'. Due to this difficulty, there is no
known tractable (or semi-tractable) program for model composition within social theory (Taylor
et al 2015; Morse & Schloman 2011).¹

However, seemingly converse to this, there are several case examples of successful
interventions, even from areas involving what would be diverse theoretic areas. There are
obviously successful enterprises and other social implementation profiles. Even then, a
posteriori, these enterprises document these theoretic model objects within artifacts (e.g.
reports, publications, models, databases etc.). An important example is the use of the rational
agent in economics. The bulk of economic theory, which is based on von Neumann-Morgenstern
utility, agent network theory, and economic metric theory, has 'clearer' mereological
hierarchies within it. Even behavioral economics seems to show that covariate profiles bias and
provide more indirect influence on economic decisions. So what, one then asks, is the seeming
theoretic disconnect? This does seem to be the 'missing piece' for more generalized modeling
within social or socio-technical spaces: across the variety of general theory, this is the
'theoretic convergence problem', or what social theory might term 'nomological convergence'.

Now this is to point out the structural congruences within these LVT methods. Certainly each is
concerned with different phenomena and uses models related thereto, but indistinguishable is
the algebraic form, which uses covariant objects and normalized, standard moments at that. What
is notable is the extent to which this provides an ongoing useful space for capturing and
exploring human behavioral phenomena across different first moment models, sometimes
independent of the choice of first order model. Not surprising is the extent to which these have
been explored as 'warning signals' to 'spot' model bifurcations within other areas, and the same
case has been used for spotting major changes in psychology, e.g. changes in affect (Sinharay
2016; Edworth et al 2003; Nordin & Kaplan 2010), in econometrics (Cho & White 2007; Blundell &
Robin 2000), in the onset of group social state (Levy 2005; Nyborg et al 2016), and in larger
human-environment systems (Bauch et al 2016; Boettiger & Hastings 2013). Finding a recurrent
signal amongst a covariant field is then strongly encouraged, and at least within the social
sciences latent variable theory appears to be the growing objective standard for 'warning
signals'.
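
As an illustration of what such a 'warning signal' computation can look like, the sketch below
tracks two second-moment indicators over a sliding window: the variance and the lag-1
autocorrelation. These two indicators are a common choice in the general early warning
literature and are an assumption here, not the specific measures used in the works cited above;
the function name and window handling are hypothetical.

```python
import numpy as np

def rolling_indicators(series, window):
    """Return rolling variance and lag-1 autocorrelation of a series.

    A sustained rise in either indicator across successive windows is the kind
    of pattern usually read as a candidate warning signal.
    """
    x = np.asarray(series, dtype=float)
    if window < 3 or window > len(x):
        raise ValueError("window must be between 3 and len(series)")

    variances, autocorrs = [], []
    for start in range(len(x) - window + 1):
        segment = x[start:start + window]
        centered = segment - segment.mean()
        total = np.dot(centered, centered)
        variances.append(segment.var(ddof=1))
        # Lag-1 autocorrelation; defined as 0.0 for a constant window.
        autocorrs.append(np.dot(centered[:-1], centered[1:]) / total if total > 0 else 0.0)
    return np.array(variances), np.array(autocorrs)
```

The sketch says nothing about which construct the series measures; it only operates on the
covariance structure, which is the point made in the paragraph above.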

The  obvious  problem  is  back  mapping  through  social  theory  to  identify  even  the  general  'causal' 
phenomena.  This  may  or  may  not  be  possible  within  an  automata  theoretic  program,  but  it 
may  be  possible  through  injection  by  those  possessing  similar  conscious  language  (e.g.  subject 
matter  experts);  i.e.  automata  may  not  have  a  programmable  method  but  paradigms  such  as 
'human-in-the-loop'  or  'human-on-the-loop'  might  allow  it.  Generally  it  was  found  that  LVT 
was  used  for  performance  evaluation  or  (through  its  statistical  cousins)  general  enterprise 
factor  exploration.  However,  its  use  to  inform  the  integration  of  models  was  not  found. 
Hopefully,  this  implies  future  research  potential. 

In  fact,  from  a  mathematical  standpoint,  artificial  intelligence  and  machine  learning  techniques 
leverage  this  by  quickly  exploring  the  covariate  profile  expanse  across  various  standard 
moments  (although  they  are  programmed  over  some  ontology).  Although  these  methods  are 
still  in  their  relative  infancy,  the  informational  groupings  around  particular  spaces  are  still  used. 
Thus  one  would  expect  to  encounter  the  general  trade-off  within  automata  theory:  once 
programmed,  these  provide  computationally  cheap  information  calculation  within  halting 
bounds,  but  they  are  hard  to  assess  outside  a  particular  programmed  space. 

However, it at the same time raises questions gained from analyzing the algebraic structure.
Discussed previously is the extent to which finding a leverageable commutative structure
extends between moments, ordinal expanse, and ultimately to the measurement and the 'real
human' phenomena. Ideally, one finds at least a shifted center across the moments which then
can solve the objective item from the 'root' of the 'human behavior phenomena'. This could
help explain why engineering efforts to implement interoperable models have not been
successful. Certainly, considering orders of moments can be tautological in causality. As
velocity (a moment of position) is bidirectionally causal to position (velocity determines
position and velocity is defined by position change), explaining major changes in a first order
model from higher moments could be considered natural occurrences within a particular model
and hence not a valuable causality; so what then separates uniqueness in human phenomena
within open covariate sets? These would still be useful to observe (as velocity is an important
measure), but the epistemic question is the extent to which a change is "a direct result from
higher orders" on the functional description or "represents a significant change in model
descriptions" (Scheffer et al 2009). To be clear, though, the bifurcation 'warning signals'
within biological systems assume that simply identifying decoupling across a covariate moment
communicates changes, but within social systems, the only available bifurcation signal generally
described and theoretically tested seems to be the ordinal covariate space formulation. By
Spearman's theory statement, a change in 'social natural' must be 'physically and unnatural
anti-entropy', so inherently one is measuring epistemically an unattributable shifting signal
within a shifting signal; not impossible but (like most human phenomena) a complex task.

Revisiting the toll road model from the previous research task (RT-138) (Pennock et al 2016), it
was discussed that finding breaks in symmetric ordering did not have a sufficient solution
without a known ordering method. One of the reasons is that individuals respond to new
signals, but the change in order of the behavior left us with a non-linear attribution. While
using a marginal price model certainly makes intuitive sense on one hand, there are control
questions on the other. A priori one would like this ordinal 'functor' (what are people actually
thinking) rather than the specific ordering (how are they acting). The former allows a priori
control and configuration, while the latter might totally disorder the schema, as individuals were
choosing the more expensive lane. The choice for the engineers and decision makers is that one
involves a configuration change and the other upends the entire set toll system.

Then one considers the algebraic observations as discussed. For a particular 'latent signal'
measure, one normalizes and/or centers around a particular (algebraic) grouping behavior. This
is the basis for the covariate moments, which then create the available field(s). As can be seen
from the examples, identifying the location of these commutative groupings (and then evaluating
their validity) is the hardest part of these methods and seemingly where the art is. In
examples of psychological phenomena (e.g. behavioral economics and social psychology), the
basis structure that underlies standard models contains both self-referent and bias effects,
amongst others. This can directly affect the algebraic structure, at least as considered from the
formal model perspective. For example, a well-known Keynesian macroeconomic description is
his suggestion that people in markets might center their behavior around different measures
but could also center around others' behavior (his famous 'beauty contest'
example). We could say more abstractly, without necessarily taking his direct claim, that
'people' actively change the structure of their behavior in some form, or else innovation and
creativity would not be human traits. Even then, broader topological and complex claims
further complicate the potential structure (Casti 1982; Kauffman & Johnsen 1991; Dixon et al
2010; Wolpert et al 2012).

This creates an underlying difficulty in relying solely on a unitary, formalized system description.
No singular 'warning signal' measure system could be found that did not have some
difficulty with the limitations discussed above. These 'systemic methods' and 'structural
investigation' were common research topic areas amongst the social sciences. However, what
was noticed is the extent to which researchers and practitioners use the methods but apply more
abstract reasoning across the algebraic classes mentioned above. Take, for instance, the previous
example of financial systems, where there has been interesting work in finding classes in trading
behavior. The bases are considerations on Brownian objects, and while the formal models are
not universally reliable fits to constructs, one can use the model and its extensions to reason
about when a market might move through 'types'. This is not universal in its next step, but
it provided a 'signal' to people familiar with what a type might entail. The example classes
('long-memory', 'rough stochastic', 'non-memoried') corresponded to major changes in formal
model structure with respect to the traditional Brownian and Bayesian assumptions. For example,
if one can class, say, depression screening in health behavior, knowing when a group suddenly
changes profile is a strong impetus for greater attention within an administering enterprise even
if one cannot gain individual-by-individual signaling. Additionally, one could have 'dual signals'
that are orthogonal, with MIMIC for individuals and multi-level models for program screening,
which could help refine potential sub-groups yielding several of these 'signals'. This is unlikely
to be reasonable in an automated way but is easily 'pre-identified' based on the context by
conscious modelers and interventionists. Upon softer analysis, most intervention literature seems
to share this intuition while implementing LVT methods, iterating on EFA and CFA profiles using
various types of data.

While each model had varying computational potential, of more interest were the analytic
properties that could be assessed from each model class. For example, 'long-memory' provides
more informational derivation and thus solvability for determining underlying latent effects. In
information theory terms, given a certain model class, the underlying 'channel space' is
determinable under a certain available algebraic structure, or it has a diminished structure
which is then determined by the choice of model paradigm. Conversely, there is reason to expect
that sufficiency over a 'signal' would shift under a new algebraic system. This can imply both a
gain in 'signal' and a gain in analytic and computational aspects, e.g. sample size profiles,
presumed data schema types, and computational complexity needs. This then makes available
potential trade-offs, however broad or abstract, that could drive economic or decision theory
considerations, which could help guide enterprise system architecture.
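
One way to make the model classes operational is through a scaling exponent of the observed
path. The sketch below estimates a Hurst-type exponent H from how the spread of lagged
differences grows with the lag, and then assigns one of the three class labels used above. The
mapping of the labels to ranges of H (above 0.5, near 0.5, below 0.5) and the tolerance are
assumptions made for illustration, not definitions taken from the cited trading literature.

```python
import numpy as np

def hurst_exponent(path, max_lag=50):
    """Estimate H from the scaling std(x[t + lag] - x[t]) ~ lag ** H."""
    x = np.asarray(path, dtype=float)
    if len(x) <= max_lag:
        raise ValueError("path must be longer than max_lag")
    lags = np.arange(2, max_lag)
    spreads = np.array([np.std(x[lag:] - x[:-lag]) for lag in lags])
    # The slope of log(spread) against log(lag) is the exponent estimate.
    slope, _ = np.polyfit(np.log(lags), np.log(spreads), 1)
    return float(slope)

def classify_memory(h, tolerance=0.05):
    """Assign a class label from H (thresholds are illustrative assumptions)."""
    if h > 0.5 + tolerance:
        return "long-memory"
    if h < 0.5 - tolerance:
        return "rough stochastic"
    return "non-memoried"

# A plain random walk (Brownian-like path) should land near H = 0.5.
rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal(20000))
print(classify_memory(hurst_exponent(walk)))
```

The estimate itself is cheap; the harder question raised in the text, whether the class is a
congruent description of the behavior being measured, is not something the computation can
answer.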

Additionally, within discussions of the models and given the construct question, there are
additional questions about which model classes are available 'in the real world'. How do both
variables and extensions of a model provide a congruent description for the system of interest
(i.e. structured behavior while trading)? This is a challenging problem, as it implies
considerations of meaning, behavior, structure, and ontology, which are difficult issues in social
theory. While we explore a potential solution using the idea of a nomological network across
category theory in Section 7.2, it should be explicitly stated that this is the crux of the
challenge of developing more generalized methods.


5.2.6  Conceptual  Theory  Discussion

Through the review of latent variable theory, there are several conceptual points that can be
observed toward the purposes within systems engineering, as we have alluded to. The
most direct is the formal methodological approach to data and models involving any human
behavioral phenomena. More removed are the opportunities, or rather the observed
complications, if one attempts to create a formal system from social modeling. And from a
general systems perspective, the notion of construct validity and similar theoretic
considerations naturally bring up notions to insert into epistemology and interoperability.

One first-consequence observation one can make is that the LVT procedure fits a category
theoretic encapsulation, or rather that category theory is a powerful enough language for its
theory. The category objects are the variant spaces (plus or minus embeddings in the case of
nested models), and the morphisms are either the projections of these spaces into 'identified
space' or the computational solving when able to be 'identified'. Respectively, the observational
variables, their moments however defined, and then the choice of eigenvalue cosetting within the
latent variable(s) are definable algebraic objects, then typed within a 'construct paradigm' (e.g.
LISREL or multivariate models implied by available objects within an experimental set-up). Each
has a choice of model structure within a defined measurement (categorical) object, so they are
then presumably settable within a context.

Now the categories as we discussed have larger extensions in pure mathematics, but their use
here provides a mathematically reasoned way to compare technical elements (e.g. statistics,
computation) against 'psycho-social' ones (e.g. 'intelligence', 'personality', 'behavior').
Particularly for 'construct validity', using Spearman's intuition, the goal may then be to
identify first-order models that capture as well as possible the naturally described elements, so
that the remaining covariate space can be reasoned to hold behavioral phenomena within
its 'lens'. Otherwise the covariate measure becomes a 'bicategory', which requires some a
priori knowledge of the structure. This is possibly why areas such as economics have more
'granularity' on their behavior, as there are more classifiable theorems, but why behavioral
economics then seems more complicated given the prevalence of behavioral considerations.

For systems engineering purposes, this 'dual space' seems to be recognized by many, either
within areas of intervention science (Strauss & Smith 2009) or enterprise systems research
(Pennock & Rouse 2015). This then raises a line of questioning on whether or not there are
categorical information potentials here. From (Strauss & Smith 2009), other references whose
focus falls under systems engineering concern ergodicity (Borsboom et al 2004), measure
abstraction (Messick 1995), informational (Kane 2006) and categorical (Clark 2006)
incompleteness, and relative validation (Cronbach 1988). This is not surprising given
Thurstone's take: as a practiced engineer, he created statistical (i.e. 'technical')
measures as a basis for the behavioral measurements, and himself realized the "[injected]
subjectivity toward choicing an objective statistical program" to which others can respond in
variety (e.g. correlation-causation, priming, self-ordering etc.). IQ, although well researched
and practiced, still has a centralized tautology in measuring, in that the quotient relationship
still has set representation, so our intuition may be that this still shows some normative
subjectivity (although at some point an Occam's razor argument becomes valid). So the question
raised is not per se how to choose the optimal measure or computation, but rather which one
'focuses' the measure such that tangible dynamics are mapped discriminately between objects
and psycho-social dynamics; this is what researchers seem to mean when discussing 'does the
measure capture the correct construct'. This then implies a 'topological approach' (e.g.
Dedekind cuts), as in LVT one maps manifested (i.e. tangible) items to numerical space and uses
the open real space to give latent objects measure.

The promise of providing a categorical theoretic encapsulation is that it gives systems
engineers a reliable theory with which to pattern the 'socio-technical' (or
theory where possible). If one has an architecture such that human behavior can be mapped as
discussed  in  the  previous  paragraph,  there  can  then  be  expected  signaling  from  human  or  social 
elements.  Presumably  then  one  can  pre-analyze  a  particular  pattern  that  maps  to  an 
identifiable  LVT  model,  and  have  then  some  measure  on  what  needs  classification  (e.g.  data 
types,  LVT  structure,  model  combinatorics).  This  as  we  discussed  is  then  an  identification  on 
the  kernels  that  the  LVT  information  is  based:  mapping  on  the  topological  space,  and  the 
measure-computational limits implied. In addition, since LVT has provided combinatorial diagrammatic
maps, these LVT models can have pre-set categories. Given this basic category, it has been
shown  that  these  category  types  can  be  mapped  to  a  database  schema  (Spivak  &  Kent  2012),  so 
then  LVT  can  presumably  have  some  automata  on  them  for  their  analysis;  at  very  least  given 
programming on the choice eigenvalue setting. Spivak also shows that once a category is
sufficiently defined within the category of sets it maps directly to a relational database
schema, so it could potentially help experimental design by providing more agile
experimental schemata, something that limits current social measurement in practice (Moustaki et al
2015). This could also potentially define where more sophisticated measures such as machine
learning could be classed, similar to complexity theory, in terms of their inference potential.
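
As a rough illustration of the category-to-schema idea cited above, the following sketch treats the objects of a small category as tables and its arrows as foreign-key style functions between tables, then checks that two paths through the schema agree. The table names, arrow names, and rows are hypothetical and chosen only for illustration; this is not the construction given by Spivak & Kent, only a minimal toy version of it.

```python
# Hypothetical toy schema: objects become tables (sets of row ids) and
# arrows become total functions between tables (foreign-key columns).
tables = {
    "Response": {"r1", "r2", "r3"},
    "Item": {"i1", "i2"},
    "Factor": {"f1"},
}

# arrow name -> (source table, target table, row mapping)
arrows = {
    "about": ("Response", "Item", {"r1": "i1", "r2": "i1", "r3": "i2"}),
    "loads_on": ("Item", "Factor", {"i1": "f1", "i2": "f1"}),
    "measures": ("Response", "Factor", {"r1": "f1", "r2": "f1", "r3": "f1"}),
}


def is_valid_instance(tables, arrows):
    """Each arrow must be a total function from its source rows into its target rows."""
    for src, dst, mapping in arrows.values():
        if set(mapping) != tables[src]:
            return False
        if not set(mapping.values()) <= tables[dst]:
            return False
    return True


def follow(path, row):
    """Apply a sequence of arrows to a row, left to right."""
    for name in path:
        row = arrows[name][2][row]
    return row


def paths_agree(path_a, path_b, source):
    """Check a path equation: both paths give the same result for every source row."""
    return all(follow(path_a, r) == follow(path_b, r) for r in tables[source])


print(is_valid_instance(tables, arrows))
print(paths_agree(["about", "loads_on"], ["measures"], "Response"))
```

The path check plays the role of a commutativity constraint: a response reaches the same latent factor whether one goes through its item or uses the direct arrow.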

From  the  schema  outside  the  computational  programs,  this  also  gives  general  patterning  with 
the  engineering  design  and  architecture  itself  as  mentioned.  Many  areas  within  enterprise 
science  and  intervention  science  often  use  'soft  analysis'  or  'open  architecture'  methods.  The 
categorical  mappings  could  be  used  either  to  more  effectively  'translate'  these  expert 
formulations into procedural practice (or rather map to where openness appears in practice)
or to show where LVT or similar methods could help validate these models. Since these measures deal
with selective moments and strong ordinal complexity, pre-defining much of the architecture
will be difficult, so one would want continual monitoring or KPIs as needed where these
'open spaces' lie, again where LVT statistics can assist. Ideally one would then like to
pre-identify where these KPIs should be scripted, along with as objective a framework as possible for update
and response; this would require at least identifying a general social pattern which could then
have a knowledge body on similar constructions.

From this, it is our conjecture that these are notions on abstract structures which are assumed
to  be  induced  in  LVT.  Then  these  underlying  construct  questions  become  more  notable  when 
formulated  in  categorical  theoretic  terms;  either  by  expanding  a  base  category  or  decomposing 
property over a generalized functional class. The underlying questions raised by performing
engineering activities on phenomena that are human at some component level then inherit a
dialectic; while human behavior is clearly describable by formal systems, human
consciousness in this system induces choice, which to the system will be seen as state-linkage or,
worse, at a categorical level by higher ordinals. Then, having knowledge of the limits on
numerical and information systems by Gödel and Chaitin respectively, one faces an
'incompleteness  limit'  that  defining  properties  exist  outside  an  embedding  on  the  formal 
system  itself.  This  problem  is  covered  in  so  called  'doxic  paradoxes'  (i.e.  'liar-like  paradoxes') 
and  similar  constructions  over  behavior  against  even  a  defined  area  such  as  game  theory 
(Koons  1992;  Simmons  1993). 

Given these 'unconstructed' sets, one expects that it is impossible, within bounded
rationality and resources within the formal system itself, to have a program. But people in
enterprises can describe this openness or incompleteness reasonably well, even if they are poorer at
'knowing' exactly what it is; for example, they identify the areas for openness within '-ilities'.

However  then  one  deals  with  subjectivity,  relativity,  or  similar  paradox  that  exists  within  these 
'natural  human  logics'.  Now  one  can  delve  into  post-modernist  ontologies,  but  telling  within 
research  is  the  extent  to  which  these  behaviors  are  considered  within  the  fields.  But  this  can 
lead  to  heuristical  theory  and  observation,  so  will  need  some  underpinning  if  nothing  else  for 
tractability.  So  ideally  one  may  like  to  have  a  space  to  identify  past  behavior,  a  space  to 
translate  consciously  identifiable  elements  (i.e.  LVT),  or  at  least  have  a  measurement  paradigm 
to  update  based  on  linguistic  information.  But  note  these  require  some  objective  language  with 
which  to  guide  practice. 

Taking the general approach necessary for systems engineering, the difficulty is then not in
representation per se, as one can easily have an individual model or a group model, but rather
in the logical programs when attempting to simulate or apply these together theoretically.
As  Borsboom  et  al  (2003)  note  in  an  example, 

"the  [factor  model]  in  research  are  between  subjects,  but  if  a  within- 
subjects  time  series  analysis  would  be  performed  on  each  of  these 
subjects,  we  could  get  a  different  model  for  each  subject.  In  fact 
Molenaar  et  al,  have  performed  simulations  in  which  they  had 
different models for each individual (pair wise one-factor, two-factor
model, etc. for each individual). It turned out that when a between-subjects
model was fitted to between-subjects data at any specific time
point, a factor model with low dimensionality provided an excellent fit
to the data, even if the majority of subjects had a different latent
structure... Thus, the mechanism at the level of the individual are not
captured, not implied, and not tested by between-subjects analyses
without heavy theoretical background assumptions that are not
simply available... And this implies that the causal statement drawn
from  such  a  measurement  model  retains  the  [original  assumed 
structural  form].  (Molenaar  et  al  2008)" 


This  is  an  uncertain  notion  that  not  only  are  psychological  measures  latent  as  most  scientific 
logical  programs  assume,  but  that  latency  within  certain  spaces  may  or  may  not  have  any 
definable linkage within the measure space (i.e. no 'identified' 'field extension'). It would
appear that group phenomena have a linear independence from those in the individual, but at
the same time, physically the individuals are a basis for the group (a 'group' does not exist
without individuals). Realistically one can appreciate why this happens, but this creates
underlying  problems  when  defining  mapping  to  a  vector  from  these  models. 

Even then, social sciences often deal with measurements that are at the same time
potentially theory irrelevant; what has been termed 'construct irrelevance'. One
could even think of constructs being locally irrelevant within a defined LVT model; e.g. presumably
behavior like 'tics' or unconscious behavior depends little on an individual's attention. It
should  then  not  surprise  that  several  have  noted  that  any  system  with  a  social  human  becomes 
complex.  This  can  be  seen  in  an  example  model  on  Uber  driver  behavior  as  behavior  can  be 
seen  to  go  'in  and  out'  of  the  assumed  'rational  agent'  model  upon  different  criteria  (Sheldon 
2015). Again one reasons on what underlies these transitions, but it is difficult to subscribe to a
particular model when the underlying phenomenon presents a null set to that 'construct'.

Rather, one ordinally notes which variables excite one model construction or the other. These
then become difficult to know a priori and again make composition considerations difficult,
and this invariably increases the order of the model.

The  implications  to  this  will  be  addressed  in  other  sections,  yet  the  purpose  here  is  to  establish 
awareness  on  these  underlying  categorical  changes  to  programs  within  the  social  sciences. 

Thus  we  use  this  to  support  our  conjecture  that  categorical  logical  rules  will  need  to  incorporate 
these  'changes  in  abstraction'  necessary  toward  any  programmatic  method.  One  can  also  then 
touch on why there are arguably replicable patterns in social psychology. But with the given
difficulties,  it  should  not  surprise  that  objective  replicability  is  currently  suffering  in  the  area 
(Schooler  2014).  Similarly,  this  leads  to  the  larger  scientific  program  within  social  theory  as  one 
needs  to  investigate  over  categorical  abstractions;  hence  why  there  are  common  dualisms  and 
dialectics  in  social  theory;  behaviorism  vs.  Gestaltian  psychology,  political  economic  'schools  of 
thoughts', and 'rational agent' & 'behavioral agent' models in economics. Just as quantum
theory developed a quantum logic, social science has its own uniqueness that requires a
logical system, one where axioms on 'social science' might be better represented or need
to be regularly exchanged.

Given  that  the  logical  breaks  happen  in  an  'abstract  algebraic  space'  then  categorical  theory  is  a 
prime  candidate  as  Peter  Smith  notes  "category  theory  gives  us  a  way  of  dealing  with  these 
layers  of  increasing  abstraction.  So  if  modern  mathematics  already  abstracts,  category  theory 
comes into its own when one abstracts again and then again" (Smith 2016) (also recommended
is Awodey 2010). Also of initial interest are those practitioners who note its use on patterning,
approximate, and agile orderings given its use over algebraic topology (here Cordier & Porter
(2008) is a good treatise formed in categories). If we are to map (and thus get an effective
statistical signal), the dialectics within the social theory must be mapped across abstract spaces
to obtain objects and functors with which to identify means to analyze these within formal systems.
Ideally  one  hopes  to  do  this  in  a  reasoned  manner  which  means  one  must  have  a  logic  in  which 
to  do  so  in  an  objective  manner.  As  of  current  research,  category  theory  is  the  only  known  a 
priori  logic  that  describes  the  abstractions  herein  with  appropriate  power. 

While  topological  considerations  in  social  science  were  encountered  (Kluver  &  Schmidt  1999),  it 
should  be  noted  that  abstract  algebraic  considerations  were  not  sufficiently  found  outside  of 
those  necessary  for  a  particular  analytic  program.  Although  the  abstract  considerations  seem 
to  these  researchers  as  algebraic  in  theory,  the  notions  are  not  currently  formulated  in  abstract 
algebraic  terms  even  against  the  'non-finitist'  schools.  However  there  are  relevant 
considerations  on  category  theory  that  have  shown  increased  attention  within  engineering  and 
particularly in computer science. These are explored in other sections, but it bears mentioning that the
potential for interoperability between these abstract social modeling and system engineering
methods should be strongly theoretically supportable given future research. Given the
increasingly  common  language,  this  presents  an  opportunity  for  merging  the  'social'  and 
'technical'  theory  embedded  in  these  systems. 


5.3  Implications  for  Finding  Counter-Intuitive  Policy  Impacts 

While  the  physical  sciences  and  the  social  sciences  take  very  different  approaches  to  dealing 
with  multiple  ontologies  and  identifying  unexpected  consequences,  at  an  abstract  level,  the 
fundamental problem is the same. A counter-intuitive or unexpected result is, by definition, a
mismatch. This mismatch can occur when comparing models to each other or comparing
models to data. When we call a result counter-intuitive, it is often because the prediction of the
mathematical or computational model does not match the prediction of a human's mental
model.  When  we  call  a  result  unexpected,  it  is  often  because  the  prediction  of  the  model  does 
not  match  empirical  data.  In  both  cases  there  is  an  issue  with  missing  information. 

If  we  view  a  model  as  compressed  data  and  the  model  is  incorrect,  that  means  that  either 
critical  data  was  missing  at  the  time  of  compression  or  that  data  was  discarded  in  order  to 
achieve  the  compression.  Thus,  if  we  have  an  unexpected  consequence  of  any  sort,  it  means 
that  we  are  missing  information  in  our  model.  If  we  want  to  predict  the  consequence  with  our 
model,  that  information  must  be  put  into  the  model  somehow.  In  the  end,  there  are  only  two 
sources  for  this  information,  empirical  data  or  theory,  and  theory  is  also  compressed  data. 

When  we  consider  the  physical  sciences,  there  are  well  validated  theories  that  seem  to  perform 
well in isolation within certain bounds. However, when those bounds are crossed, there may be
no obvious way to connect them. So modelers attempt to connect the existing theories by using empirical
data to introduce the missing information. The problem is that the validity of the existing
theories does not automatically transfer to the composite model. Thus, the composed model is,
in a sense, a new theory, and the "tuning" process only establishes a localized validity.
Consequently, it is difficult to have confidence in projections outside of the range of evaluation.
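
The point about localized validity can be made concrete with a small numerical sketch. Here a quadratic function stands in for the unknown real system and a straight line stands in for the tuned composite model; both choices, and the tuning range of 0 to 1, are assumptions made purely for illustration. The line is tuned by ordinary least squares on data from the tuning range and then evaluated both inside and outside that range.

```python
def truth(x):
    """Stand-in for the real system (assumed quadratic for illustration only)."""
    return x ** 2


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# "Tuning" data: observations only within the range of evaluation [0, 1].
xs = [i / 10 for i in range(11)]
ys = [truth(x) for x in xs]
slope, intercept = fit_line(xs, ys)


def model(x):
    """The tuned model: a compression of the tuning data into two numbers."""
    return slope * x + intercept


# Worst error inside the tuning range versus error at a point well outside it.
in_range_error = max(abs(model(x) - truth(x)) for x in xs)
out_of_range_error = abs(model(3.0) - truth(3.0))

print(f"max error inside tuning range: {in_range_error:.2f}")
print(f"error at x = 3.0:              {out_of_range_error:.2f}")
```

The curvature that was discarded when the data were compressed into a line is exactly the missing information that produces the large out-of-range error.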

For  the  social  sciences,  on  the  other  hand,  it  is  difficult  to  even  establish  a  reliable  compression 
of  the  available  data.  So  much  relevant  data  is  dropped  during  the  compression  process,  that  it 
is  difficult  to  develop  stable  models  at  all.  Consequently,  there  is  often  a  proliferation  of 
alternative theories and even ontologies in the social sciences. However, this is informative in its
own  right.  Each  of  these  alternative  theories  could  be  viewed  as  generators  of  potential 
unintended  consequences  that  have  some  validity.  We  know  that  each  of  the  established 
theories  was  correct  at  least  often  enough  that  it  became  accepted.  Thus,  comparisons  among 
these  alternative  models  are  potential  sources  of  unintended  consequences. 

If  we  are  concerned  about  counter-intuitive  policy  impacts  or  unintended  consequences  of  a 
policy,  this  suggests  that  any  experienced  "unintended  consequences"  were  the  result  of 
information  that  was  omitted  from  analysis.  Sometimes  we  informally  call  these  higher  order 
effects.  But  from  the  perspective  of  this  analysis,  the  consequence  may  have  been  predictable 
had  the  proper  information  been  injected.  Empirical  data  would  be  preferred,  but  this  is  often 
impractical for many policy analyses. Consequently, the only other source is theory. Yet, as we
have discussed, for behavioral and social issues, there are often many possible alternative theories.
But which one is the right one? As the previous section found, we usually do not know a priori.
Thus, our only option is to try multiple different model configurations and generate a spread of
scenarios. Beyond empirical data, this is the only way to "catch" an unintended consequence
or a counter-intuitive result.

This  leads  us  to  the  conclusion  that  to  have  a  viable  approach  to  detect  unintended 
consequences,  we  must  have  a  systematic  way  to  introduce  alternative  structure  into  a  model. 
In  the  case  of  the  core-peripheral  approach,  the  core  is  effectively  the  first  order  model  that 
links  the  decision  variables  to  the  output  variables  of  interest.  The  peripheral  models  are 
alternative  "theories"  for  how  portions  of  the  enterprise  might  behave.  Thus,  we  need  a  way  to 
systematically  explore  the  space  of  possible  peripheral  models  and  then  integrate  them  with 
the  core.  This  creates  two  technical  challenges.  First,  the  core  and  various  peripheral  models 
may  have  very  different  ontologies.  Even  worse,  these  may  overlap,  meaning  that  they  attempt 
to  represent  the  same  "thing"  in  more  than  one  way.  To  overcome  this,  one  needs  a 
mathematical  understanding  of  the  rules  that  govern  when  and  how  models  with  multi-scale 
ontologies  may  be  integrated.  This  will  be  addressed  in  the  next  section. 

Second, the space of potential model permutations is vast. Since it will not be possible (or even
necessary) to try them all, what is an appropriate way to navigate through this space without an
obvious dimension to order the models on? This will be discussed in Section 7.
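
The core-peripheral idea described above can be sketched in a few lines. In this minimal example the core is a first-order relation between a decision variable and an output, and each peripheral slot has several alternative "theories" that can be swapped in. Everything here is hypothetical: the incentive policy, the theory names, and the coefficients are invented for illustration and the exhaustive enumeration only works because the toy space is tiny.

```python
from itertools import product

# Hypothetical peripheral "theories" of how suppliers respond to an incentive.
supplier_theories = {
    "rational": lambda incentive: 1.0 + 0.5 * incentive,
    "satisficing": lambda incentive: 1.0 + 0.1 * incentive,
    "crowding_out": lambda incentive: 1.0 - 0.2 * incentive,
}

# Hypothetical peripheral "theories" of how demand responds to the same incentive.
demand_theories = {
    "inelastic": lambda incentive: 1.0,
    "elastic": lambda incentive: 1.0 - 0.3 * incentive,
}


def core(supply_factor, demand_factor):
    """First-order core model linking the peripheral factors to the output."""
    return 100.0 * supply_factor * demand_factor


def run_scenarios(incentive):
    """Evaluate the core under every combination of peripheral theories."""
    results = {}
    for (s_name, s), (d_name, d) in product(
        supplier_theories.items(), demand_theories.items()
    ):
        results[(s_name, d_name)] = core(s(incentive), d(incentive))
    return results


baseline = run_scenarios(0.0)
treated = run_scenarios(1.0)

# Effect of the policy under each model configuration.
effects = {config: treated[config] - baseline[config] for config in treated}
spread = max(effects.values()) - min(effects.values())

# Configurations in which the incentive lowers the output: candidate
# counter-intuitive results relative to the intent of the policy.
counter_intuitive = sorted(c for c, e in effects.items() if e < 0)

for config, effect in sorted(effects.items()):
    print(config, round(effect, 1))
print("spread of effects:", round(spread, 1))
print("candidate counter-intuitive configurations:", counter_intuitive)
```

With only two peripheral slots the number of configurations is the product of the number of alternatives per slot, which is why exhaustive enumeration stops being practical as more slots are added.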


6  Mathematical  Analysis  of  Multi-scale  Ontologies 

As discussed in Section 5.1, multi-scale and multi-level models have become popular in the
physical  sciences  and  engineering.  Implicit  in  these  approaches  is  that  the  models  are  actually 
composable,  yet  it  is  well  known  that  the  composition  of  heterogeneous  models  is  a  non-trivial 
endeavor  (Taylor  et  al  2015).  Mathematical  definitions  of  simulation  interoperability  and 
composability  have  already  been  developed  (Weisel  et  al  2003).  Rather,  the  interest  here  is 
understanding  what  leads  to  the  interoperability  and  composition  issues  in  the  first  place.  The 
Levels  of  Conceptual  Interoperability  Model  (LCIM)  indicates  that  a  lack  of  conceptual 
interoperability  among  models  with  regard  to  the  reference  system  can  cause  such  issues  (Tolk 
&  Muguira  2003,  Wang  et  al  2009).  The  objective  of  this  section  is  to  develop  a  mathematically 
rigorous  explanation  of  what  it  means  to  have  a  lack  of  conceptual  interoperability  among 
models  as  a  consequence  of  the  characteristics  of  the  system  being  modeled  and  the  selected 
abstractions.  The  intent  is  to  provide  a  first  step  to  understanding  and  facilitating  the 
composition  of  models  and  simulations  to  support  the  development  of  multi-level  models  of 
enterprise  systems. 

To  accomplish  this,  Rosen's  (1978)  approach  to  measuring  and  analyzing  systems  using 
commutative  diagrams  over  sets  is  adapted.  This  approach  provides  a  mechanism  with  which  to 
explore  the  underlying  linkage  relationships  among  diverse  systems  views.  The  nature  of  these 
linkage relationships impacts the ability to compose the associated models.

To  that  end,  three  categories  of  linkage  relationships  are  introduced:  unlinked,  state  linked,  and 
transition  linked.  Examination  of  the  multiscale  physics  modeling  literature  provides  insights  as 
to  how  these  categories  are  addressed  in  practice.  The  outcomes  of  this  analysis  are  a 
definition  of  conceptual  interoperability  as  a  lack  of  transition  linkages  across  models  and  a  set 
of  four  hypothesized  sources  of  transition  linkages  among  composed  models. 


6.1  Composite  Models  in  Engineering 

Models  have  long  been  used  to  support  engineering  decision  making.  However,  one  of  the 
recurring  themes  of  systems  engineering  is  that  multiple  perspectives  and  hence  multiple 
models  are  necessary  to  understand  a  real  world  system.  This  viewpoint  is  evident  in 
architecture frameworks such as Zachman (1987) and DoDAF as well as modeling languages such
as SysML and IDEF that explicitly facilitate the conceptual linkage among diverse system views.
The  logical  consequence  is  that  to  support  systems  engineering  decision  making,  one  needs  to 
compose  models  from  multiple  perspectives.  The  formalization  of  this  principle  is  known  as 
Model  Based  Systems  Engineering.  Dickerson  and  Mavris  (2013)  provide  a  detailed  history  of 
the  evolution  of  MBSE  and  the  formal  mathematical  foundations  of  system  design. 

While  such  approaches  allow  for  the  computational  exploration  of  trade  spaces  by  propagating 
high-level  changes  down  to  the  physics-based  level,  it  is  also  necessary  to  propagate  low-level 
impacts  back  up.  While  the  latter  can  be  accomplished  through  empirical  testing,  that  can  be  an 
expensive and time consuming approach. Unfortunately, accomplishing this computationally
has  been  challenging  to  do  in  a  comprehensive  way. 

The  rapid  increase  in  available  computational  power  over  the  last  several  decades  combined 
with  growing  inventories  of  computational  engineering  models  and  simulations  have  led  many 
to  wonder  if  we  could  accomplish  the  ideal  of  a  comprehensive  tradespace  exploration  upfront 
by computationally connecting existing or adapted models. In principle, this composition
would achieve two benefits. First, it would allow for the computational exploration of trade spaces by
propagating high-level changes down to the physics-based level and propagating low-level
impacts  back  up.  Second,  it  would  facilitate  tracing  impacts  across  diverse  system  viewpoints 
such  as  the  cost  view,  functional  view,  etc.  Some  have  termed  the  comprehensive  use  of 
integrated  engineering  models  throughout  the  system  life-cycle  Model  Centric  Engineering 
(MCE).  Regardless  of  the  name,  a  successful  composition  of  independent  models  is  required. 

The  idea  of  computationally  composing  existing  engineering  models  from  multiple  perspectives 
to  assess  system  designs  is  not  new.  One  example  is  Multi-Disciplinary  Optimization  (MDO)  in 
aerospace  engineering  (Yao  et  al  2011).  There  have  also  been  a  number  of  attempts  to  develop 
general  model  composition  frameworks  in  recent  years.  We  will  briefly  mention  three.  First,  the 
most well-known is the IEEE standard High Level Architecture (HLA) (IEEE 1516) (IEEE 2010).
HLA provides a generic framework for federating multiple simulations via coordinated
execution  and  data  exchange.  Second,  SPLASH  is  a  framework  developed  by  IBM  Research  for 
loosely  coupling  models  from  different  domains  using  a  description  language  called  SADL 
(Barberis  et  al  2012).  Third,  the  Dynamic  Multilevel  Modeling  Framework  (DMMF)  was  an  effort 
by  the  US  Department  of  Defense  to  compose  existing  simulations  across  four  levels:  campaign, 
mission, engagement, and engineering to support system design and acquisition, though it was
ultimately  dropped  due  to  infeasibility  (Mullen  2013). 

As  far  as  actually  integrating  composite  models  into  the  system  engineering  process,  two 
efforts  bear  mentioning.  First,  OpenMETA  is  an  integrated  tool  suite  that  was  developed  as  part 
of  the  DARPA  Adaptive  Vehicle  Make  program  (Sztipanovits  et  al  2014;  Sztipanovits  et  al  2015). 
It  allows  one  to  reuse  and  compose  existing  engineering  tools  to  design  cyber-physical  systems. 
The objective is to achieve a "correct by construction" design and avoid late redesign. Second,
NASA's  Jet  Propulsion  Lab  has  an  Integrated  Model-Centric  Engineering  (IMCE)  initiative  that 
aims  to  better  integrate  engineering  models  across  multiple  disciplines  into  the  systems 
engineering  process  for  its  space  science  missions  (Bayer  et  al  2011).  Anticipated  benefits 
include  increased  reuse  of  existing  engineering  solutions,  continuous  verification  and  validation, 
and  more  rapid  exploration  of  the  design  tradespace. 

Friedman and Leondes (1969a,b,c) recognized the challenges of assessing internal consistency
across  multiple  system  models  and  developed  constraint  theory  to  do  so.  More  recently,  the 
National  Science  Foundation  held  a  workshop  to  identify  research  challenges  to  using  modeling 
and  simulation  to  engineer  complex  systems.  As  the  workshop  report  notes,  "The  reuse  of 
models  is  confounded,  however,  by  the  fact  that  they  are  peculiarly  fragile  in  a  certain  sense  - 
they  are  typically  context-sensitive,  highly  purposeful  abstractions  and  simplifications  of  a 
perception of a reality that has been shaped under a possibly unknown set of physical, legal,
cognitive  and  other  kinds  of  constraints  by  a  modeler,  or  modeling  team;  quite  often  a  model's 
function  is  sensitive  to  many  unstated  assumptions.  The  end  result  is  that  model  reuse  can  be 
fraught  with  significantly  more  complexity  than,  say,  reusing  the  implementation  of  a  sorting 
routine."  (Fujimoto  2016) 

Consistent  with  the  above  observation,  frameworks  have  also  been  developed  for  domain 
specific  composition  of  simulations.  A  recent  example  in  the  area  of  infrastructure  modeling  is 
provided by Grogan and de Weck (2015). Another in the area of modeling logistics systems is
provided  by  Sprock  and  McGinnis  (2014). 

Several  questions  naturally  follow: 

•  Why  does  the  computational  composition  of  engineering  models  work  well  in  some 
circumstances  but  not  others? 

•  Why  does  being  domain  focused  seem  to  improve  the  chances  of  success? 

•  Are  there  any  indicators  that  would  let  one  know  when  composition  is  feasible  to 
attempt? 

•  Are  there  standards  or  approaches  to  model  design  that  would  facilitate  future 
composition? 

These  questions  have  certainly  been  asked  before.  Those  who  have  experience  building 
composite  engineering  simulations  probably  have  intuitive  answers  for  them.  The  objective  of 
this  analysis  is  to  develop  a  mathematical  description  to  make  certain  aspects  of  that  intuition 
precise.  In  particular,  we  wish  to  consider  how  conceptual  interoperability  or  lack  thereof 
among  heterogeneous  system  models  affects  the  composability  of  said  models.  The  intent  of 
the  mathematical  description  is  to  serve  as  a  mechanism  to  frame  hypotheses  regarding  the 
above  questions. 


6.2  Approach 

Developing a mathematical description of the role of conceptual interoperability in model
composition is tantamount to modeling modeling. While there is an entire branch of
mathematics called model theory, it is concerned with the concept of modeling in general.
Here, however, we are concerned with some very specific questions:

•  What  does  it  mean  mathematically  to  model  a  system  from  multiple  perspectives? 

•  What  conditions  does  a  successful  composition  of  multiple  heterogeneous  models  imply 
with  regard  to  the  models  and  the  system  of  interest? 

•  What  attributes  of  the  models  or  the  system  of  interest  would  cause  a  composition  to 
fail? 

•  How have those causes been addressed in the past if at all?

•  What  are  the  implications  for  MBSE  and  MCE? 

•  How  might  the  resulting  challenges  be  mitigated? 


Consequently,  the  author  chose  to  adapt  Rosen's  approach  to  modeling  systems  as  developed 
in  the  monograph  "Fundamentals  of  Measurement  and  Representation  of  Natural  Systems" 
(Rosen  1978).  Rosen's  concern  was  how  to  measure  and  model  natural  systems  and  the 
associated  implications  for  physics  and  biology.  More  specifically,  he  was  interested  in  the 
interrelationships  among  different  perspectives  of  a  system.  Thus,  Rosen's  work  provides  an 
appropriate  set  of  mathematical  building  blocks  to  explore  the  above  questions. 

The  investigative  approach  taken  is  as  follows: 

•  Adapt  Rosen's  work  to  describe  mathematically  what  it  means  to  model  a  system  from 
multiple  perspectives 

•  Extend  that  description  to  define  the  conditions  for  the  successful  composition  of 
multiple  models 

•  Analyze  the  description  to  identify  potential  deviations  from  these  conditions 

•  Analyze  the  description  to  see  how  the  deviations  might  be  addressed 

•  Compare  the  results  to  findings  from  multi-scale  physics  modeling 

•  Draw  inferences  about  the  implications  for  engineering  modeling 

•  Define  hypotheses  and  research  questions  for  future  investigation 


6.3  A  Mathematical  Explanation  of  Modeling  a  System  from  Different  Perspectives 

The  goal  of  this  section  is  to  express  very  precisely  what  it  means  to  model  a  system  from 
different  perspectives.  This  is  accomplished  by  adapting  the  work  of  Rosen  (1978).  Rosen  used  a 
combination  of  equivalence  relations  and  commutative  diagrams  over  sets  to  explore 
relationships  among  multiple  views  of  a  system13.  We  consider  the  following  topics  in 
sequence: 

•  What  does  it  mean  to  view  a  system  from  different  perspectives? 


13  The  ideas  presented  in  this  subsection  are  attributable  to  Robert  Rosen.  However,  Rosen's  original  presentation 
is  very  abstract  with  few  explanatory  examples.  The  researcher's  contribution  is  a  tailored  summary  and 
explanation  of  those  ideas  in  the  context  of  engineering  modeling.  The  author  attempted  to  maintain  as  much 
consistency  as  possible  with  Rosen's  notation  in  order  to  facilitate  comparison  with  his  work.  However,  some 
departures  from  his  notation  were  unavoidable  due  to  differences  in  focus. 



•  What  are  the  relationships  among  these  perspectives? 

•  How  do  we  model  a  system  from  a  given  perspective? 

•  What  are  the  relationships  among  models  of  different  perspectives? 

To  facilitate  the  discussion,  we  will  introduce  a  very  simple  example  from  basic  physics  and 
revisit  it  throughout.  Imagine  a  simple,  one-dimensional  universe  that  contains  only  two 
massive bodies whose attraction is governed by Newton's law of gravity, $F = G m_1 m_2 / r^2$. As
the  different  components  of  Rosen's  framework  are  introduced,  we  will  consider  what  they 
mean  in  terms  of  this  example. 


6.3.1  What  does  it  mean  to  view  a  system  from  different  perspectives? 

Rosen starts with the assumption that a system is defined by a set of states, S. How do we know
what elements make up S? According to Rosen, we do not know. The best we can do is measure
observables and make inferences about S. In terms of our simple example with two bodies,
observables would include their positions, temperatures, masses, and so on. Mathematically,
observables are functions that map the state space, S, to another set such as the real numbers.

A given observable, f, generates an equivalence relation $R_f$ on S. That means that any two
states $s, s' \in S$ belong to the same equivalence class if $f(s) = f(s')$. As a result, we will be
unable to discern differences in state that occur within the same equivalence class using only
the observable f. For example, if we only measure the positions of our two bodies, we are not
able  to  differentiate  among  system  states  that  have  the  same  positions  but  different 
temperatures. 

Of course, we can measure more than one observable. A set of observables, F, generates an
equivalence relation, $R_F$, on S. For example, if F consists of both position and temperature,
under $R_F$ all states of S that have the same temperatures and positions would be viewed as
equivalent. The quotient set $S/R_F$ is the reduced set of system states that results from the set of
observables, F. It is a partition of the set S. For our simple example, we have reduced the set of
states to a set of vectors of positions and temperatures.

It is important to note that the reduced state space of the system, $S/R_F$, is a consequence of
which  observables  are  collected.  Thus,  each  set  of  observables  constitutes  an  abstraction  of  the 
system.  This  provides  a  precise  way  to  express  what  is  meant  by  viewing  a  system  from  a 
particular  perspective.  A  perspective  is  the  quotient  set  generated  by  the  collection  of 
observables  applied  to  a  system. 
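
As a minimal sketch of this idea, the quotient set can be computed for a small finite set of states by grouping states that agree on every chosen observable. The toy states and the helper name `quotient` are assumptions made purely for illustration; they do not come from Rosen's presentation.

```python
from collections import defaultdict

def quotient(states, observables):
    """Partition `states` into classes that agree on every observable."""
    classes = defaultdict(list)
    for s in states:
        key = tuple(f(s) for f in observables)  # observed values label the class
        classes[key].append(s)
    return dict(classes)

# Hypothetical toy states: (position, temperature, colour)
S = [(0, 290, "red"), (0, 290, "blue"), (0, 300, "red"), (1, 290, "red")]
position = lambda s: s[0]
temperature = lambda s: s[1]

by_position = quotient(S, [position])            # coarser perspective
by_both = quotient(S, [position, temperature])   # finer perspective
```

Measuring position alone cannot separate the three states at position 0, whereas adding temperature splits them into two classes. Colour is never observed, so states that differ only in colour remain indistinguishable under both perspectives.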

Beyond  understanding  the  current  state  of  the  system,  it  is  also  of  interest  to  understand  how 
the  system  changes  states  over  time.  The  chosen  collection  of  observables  also  affects  what 
state transitions we can discriminate. Rosen defined changes in the state of the system as an
automorphism on S.

Let T be an automorphism on S. If T is compatible with $R_F$, then T induces an automorphism on
the reduced set of states, $S/R_F$. Let us call this automorphism $T_F$. This is a description of the
state transitions for the reduced set of states $S/R_F$. Introducing the composition operator
generates a group of automorphisms from $T_F$ that can be used to define trajectories in the
reduced state space. Indexing the resulting elements of the group by $t \in \mathbb{Z}$ or $t \in \mathbb{R}$ describes
changes in system state versus time. For our two-body example, repeated applications of $T_F$
would describe how the positions of the two bodies change over time. We call this the system's
dynamics.

How can we determine $T_F$? Again, we cannot do this directly. We can only infer it. To
complicate  things  further,  observables  are  not  measured  directly.  Rather,  specially  configured 
systems  called  meters  are  used.  Meters  are  designed  to  dynamically  interact  with  the  system  of 
interest  and  asymptotically  approach  a  value  taken  to  be  the  measurement  of  the  observable. 
An  example  would  be  using  a  thermometer  to  measure  temperature. 

Assume that $M_F$ is the meter that measures the set of observables F. To understand $T_F$, we take
successive measurements using the meter $M_F$ and try to infer $T_F$. This situation is expressed by
Equation 1.

\[
\begin{array}{ccc}
S/R_F & \xrightarrow{\;T_F\;} & S/R_F \\
\big| & & \big| \\
M_F & & M_F
\end{array}
\]

Equation 1


This  setup  allows  us  to  express  the  impact  of  an  abstraction  defined  by  a  collection  of 
observables F on the perceived dynamics of the system. If T is compatible with $R_F$, then $T_F$ is
a bijection and the dynamics is deterministic and reversible. However, since $S/R_F$ reduces the
set of states, there is no guarantee that T will be compatible with $R_F$. For many
realistic problems, it will not be. The result is that $T_F$ will split equivalence classes of $R_F$. This
situation enables us to discriminate among more states of S than we could with F alone, but it
also makes the system appear stochastic and/or irreversible. Since we often encounter such
situations in real life, we will only require $T_F$ to be an endomorphism as opposed to an
automorphism  for  the  remainder  of  this  paper. 
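
The compatibility condition can be illustrated on a finite example. The sketch below assumes a hypothetical state space of (position, velocity) pairs on a ring of four positions; the function name `induced_map` and the dynamics `T` are illustrative assumptions, not part of Rosen's formalism.

```python
def induced_map(states, observe, T):
    """Return the map T induces on observed classes, or None if T splits a class."""
    induced = {}
    for s in states:
        src, dst = observe(s), observe(T(s))
        # The first state seen in a class fixes its image; later ones must agree.
        if induced.setdefault(src, dst) != dst:
            return None  # two equivalent states land in different classes
    return induced

# Hypothetical states: (position on a ring of 4 cells, velocity of 0 or 1)
S = [(x, v) for x in range(4) for v in (0, 1)]
T = lambda s: ((s[0] + s[1]) % 4, s[1])  # a bijection on S

position_only = induced_map(S, lambda s: s[0], T)   # velocity is hidden
full_state = induced_map(S, lambda s: s, T)         # nothing is hidden
```

With position as the only observable, states in the same class move to different classes depending on the hidden velocity, so no well-defined induced map exists and the reduced dynamics would appear non-deterministic. Observing both position and velocity restores a well-defined, one-to-one induced map.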


6.3.2 What are the relationships among multiple perspectives of a system?

Extending the idea that an abstraction of a system is determined by a collection of observables,
we ask how we can precisely define relationships among multiple abstractions of the same
system. These relationships are known as linkages and they can be defined by which
combinations of equivalence classes from each of the perspectives are allowable.

Assume that a system can be described by two observables f(s) and g(s). Each generates an
equivalence relation, $R_f$ and $R_g$ respectively. If both are applied at the same time, the result is
the equivalence relation $R_{fg}$. What is the relationship among these three equivalence
relations? If every class of $R_f$ intersects every class of $R_g$, and vice versa, then the observables f
and g are completely unlinked. That means that knowing the value of one observable provides
no information on the value of the other. In other words, the reduced state space $S/R_{fg}$ is
the Cartesian product of the reduced state spaces generated by f and g.

$S/R_{fg} = S/R_f \times S/R_g$

On the other hand, if every class of $R_f$ intersects exactly one class of $R_g$, and vice versa, then
the observables f and g are completely linked. Knowing the value of one observable
determines the value of the other. This substantially reduces the possible state space as now

$S/R_{fg} \subset S/R_f \times S/R_g$.

Returning  to  the  two-body  example,  imagine  that  the  two  bodies  are  widely  separated.  If  the 
observables  of  interest  are  the  positions  of  each  body,  then  the  two  positions  are  unlinked. 
Setting  the  position  of  one  body  does  not  restrict  the  set  of  possible  positions  of  the  other. 

Now,  assume  that  we  also  want  to  measure  two  more  observables:  the  temperature  of  each 
body  and  the  peak  wavelength  of  electromagnetic  radiation  emitted  by  each  body.  These  two 
observables  are  linked  because  only  certain  combinations  of  equivalence  classes  are  allowable. 
For  example,  if  the  temperature  of  one  of  the  bodies  is  290K,  the  peak  wavelength  cannot  be  in 
the  ultraviolet  range.  For  a  perfect  black  body,  the  linkage  relationship  is  described  by  Planck's 
Law.  One  could  argue  that  most,  if  not  all,  scientific  laws  are  descriptions  of  linkage 
relationships. 

This  concept  will  be  important  when  considering  how  to  model  a  system.  A  linkage  relationship 
can  also  be  viewed  as  a  symmetry  that  allows  one  to  compress  the  state  description  of  the 
system.  Consequently,  for  an  abstraction  to  be  useful,  it  should  consist  of  a  set  of  observables 
that  are  related  by  linkage  relationships.  Or,  to  put  it  another  way,  what  would  be  the  benefit 
of  including  unlinked  observables  in  the  same  abstraction?  For  example,  it  is  useful  to  include 
body  temperature  and  the  intensity  of  emitted  radiation  at  each  wavelength  in  the  same 
abstraction.  One  can  use  the  linkage  relationship  to  build  an  infrared  thermometer  for  instance. 
But there would be little use in including an unlinked observable like position in that
abstraction.

More generally, two or more observables may be partially linked where knowledge of the value
of one observable provides incomplete information on the state of another. For example, the
object might not be a perfect black body. Linkage relationships may also involve more than two
observables. An example of this would be Ohm's law (V=IR), which assumes a complete linkage
among the observables voltage, current, and resistance in an electrical circuit. Knowing values
of two of the observables enables us to determine the third. Importantly, the strength of a
linkage relationship among observables may vary over different subsets of the state space, S.
This  is  true  of  most  if  not  all  of  the  scientific  laws  observed  to  date.  Thus,  one  must  always 
specify  when  a  given  law  or  symmetry  relationship  does  and  does  not  apply. 
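
For a finite set of states, the three degrees of linkage between two observables can be checked directly from the definitions above. The following is only a sketch: the function name `linkage` and the three toy state sets are assumptions chosen for illustration.

```python
from itertools import product

def linkage(states, f, g):
    """Classify the linkage between observables f and g over `states`."""
    f_vals = {f(s) for s in states}
    g_vals = {g(s) for s in states}
    joint = {(f(s), g(s)) for s in states}  # combinations that actually occur
    if joint == set(product(f_vals, g_vals)):
        return "unlinked"  # every class of f meets every class of g
    g_per_f = all(len({b for a, b in joint if a == fv}) == 1 for fv in f_vals)
    f_per_g = all(len({a for a, b in joint if b == gv}) == 1 for gv in g_vals)
    if g_per_f and f_per_g:
        return "completely linked"  # each value determines the other
    return "partially linked"

first = lambda s: s[0]
second = lambda s: s[1]

# Hypothetical state sets
free_positions = [(a, b) for a in range(3) for b in range(3)]
exact_law = [(t, 2 * t) for t in range(4)]
noisy_law = [(a, a + d) for a in range(3) for d in (0, 1)]

results = [linkage(S, first, second) for S in (free_positions, exact_law, noisy_law)]
```

The third state set mimics an imperfect law: knowing the first observable narrows the second to two candidates without fixing it.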


6.3.3 How do we model a perspective of a system?

Symmetry  relationships  also  enable  us  to  build  models  of  a  system.  As  a  term,  model  has  many 
different  uses  in  many  different  contexts.  Consequently,  we  must  define  what  we  mean  by 
model  in  the  context  of  this  discussion  acknowledging  that  this  definition  is  not  universal.  In  the 
discussion  that  follows,  we  will  limit  the  scope  to  models  that  we  use  for  prediction  as  that  is 
chiefly  the  motivation  behind  the  model  composition  efforts  in  engineering. 

In  short,  prediction  is  the  ability  to  determine  the  state  of  a  system  of  interest  under 
circumstances  not  experienced  including  different  times,  locations,  and  contexts.  One  way  this 
could  be  accomplished  is  with  a  complete  description  of  all  possible  state  transitions  for  a 
system  of  interest.  In  terms  of  the  setup  developed  in  the  previous  section,  this  would  be  the 
automorphism  that  generates  the  dynamics  of  the  system. 

$S \xrightarrow{\;T\;} S$

There are two problems here. First, we do not know what S is, as we interact with it indirectly
via meters. Second, even if we knew what S was, for any non-trivial system, determining all the
state mappings is effectively impossible since one will not or cannot experience all possible
states $s \in S$. So what are we left with? As discussed in the previous section, we can achieve a
reduced description of the state space S through observables. So the next best thing is if we
could identify an endomorphism ($T_F$) over the reduced state space for a set of observables, F,
that we are interested in.

$S/R_F \xrightarrow{\;T_F\;} S/R_F$

The  objective  is  to  infer  the  dynamics  of  the  reduced  set  of  states  of  the  system  by  taking 
successive  readings  with  meters.  Again  we  are  faced  with  the  problem  that  predicting  the 
future  state  of  a  system  of  interest  involves  explicitly  knowing  all  possible  state  transitions  for 
the  reduced  state  space. 

One  way  to  address  this  problem  is  to  find  a  relationship  among  the  observables  that  is 
invariant  over  the  dynamics.  A  symmetry  relationship  fits  this  requirement.  A  symmetry  allows 
one  to  compress  the  mapping  by  dropping  redundant  relationships.  They  can  be  reconstructed 
from the symmetry relationship when needed. By selectively applying these symmetry
relationships,  one  can  build  a  new  system  (physical  or  mathematical)  that  can  serve  as  a 
compressed  representation  of  the  target  system's  behavior.  We  call  this  new  system  a  model 
for  the  target  system.  In  other  words,  the  symmetry  relationships  are  used  to  reconstruct  the 
target  system's  dynamics  on  demand  via  the  execution  of  experiments  for  physical  models  or 
computation  for  mathematical  models. 

In  order  to  make  this  concept  precise,  we  need  to  introduce  a  new  set  of  states  for  the  system 
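we are calling our model of ($S/R_F$, $T_F$). Let X be the set of states of this new system with a
corresponding set of allowable state transitions, $D_X$. For the system (X, $D_X$) to be a model of
the dynamics of the abstraction $S/R_F$ of the system, the diagram in Equation 2 must commute.

\[
\begin{array}{ccc}
S/R_F & \xrightarrow{\;T_F\;} & S/R_F \\
\Big\downarrow {\scriptstyle \alpha} & & \Big\downarrow {\scriptstyle \alpha} \\
X & \xrightarrow{\;D_X\;} & X
\end{array}
\]

Equation 2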


In  essence,  what  this  diagram  asserts  is  that  if  we  measure  the  observables  of  interest  on  the 
system, encode them into the state space, X, of the model via the mapping $\alpha$, and propagate
the model state forward using the mapping $D_X$, we will get exactly the same result as if we
measured the system at the later time and mapped it into the model via $\alpha$. This definition is
quite general. (X, $D_X$) could represent a physical analog or a mathematical model. If this
diagram commutes, repeated application of $D_X$ from a particular starting state yields the
predicted trajectory of the system through the state space.
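
For a finite set of reduced states, the commutativity requirement can be tested exhaustively. The sketch below is illustrative only: the function `commutes`, the unit-conversion encoding, and the two candidate models are hypothetical choices, not taken from the source.

```python
def commutes(reduced_states, T_F, alpha, D_X):
    """True when alpha(T_F(s)) == D_X(alpha(s)) for every reduced state s."""
    return all(alpha(T_F(s)) == D_X(alpha(s)) for s in reduced_states)

# Hypothetical system: a body whose measured position advances 2 m per time step.
positions_m = range(10)
T_F = lambda s: s + 2

# Hypothetical model: the same position encoded in centimetres.
alpha = lambda s: 100 * s
good_model = lambda x: x + 200   # advances 200 cm per step
bad_model = lambda x: x + 100    # advances only 100 cm per step

good = commutes(positions_m, T_F, alpha, good_model)
bad = commutes(positions_m, T_F, alpha, bad_model)
```

Only the first candidate reproduces the measured transition after encoding, so only it qualifies as a model in the sense of the diagram.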

More precisely, X is the encoding via the mapping $\alpha$ of a subset of the state space
$\prod_{f_i \in F} S/R_{f_i}$. This reduction is achievable because of the identified linkage relationships among
the variables. In the case of a mathematical model, X captures the equation of state. We should
note that any observables in the original set, F, that are completely unlinked with the
observables of interest are typically omitted. Mathematically, this is equivalent to replacing
these with constant observables. In the case that we also restrict the state space, S, such that it
falls entirely within a single equivalence class of another observable, that observable can be
viewed as a parameter of the model.

The interpretation of $D_X$ depends on whether X is a physical analog of the target system or a
mathematical model. In the case of the former, we induce some physical analog of the
dynamics. An example would be testing a model aircraft in a wind tunnel. In the case of the
latter, $D_X$ takes the form of computation, which could be solving an analytical model or running
a simulation.

Returning to our two-body system, one example of a model of the positions of this system over
time could be ($x_i$, $dx_i/dt$), where $x_1$ and $x_2$ are the positions of the two bodies. If the two
bodies are far enough apart, we can treat the gravitational force as negligible and model the
state transitions of the two bodies independently. If this is a valid model, then we should expect our predictions of
positions at future times generated with our mathematical model to match the measurements
taken on the real system. That is effectively what Equation 2 asserts.

The reader may note that if the dynamics of the system is stochastic (i.e., $T_F$ is not one-to-one),
then the diagram will not commute. Of course, if the diagram does not commute, then (X, $D_X$)
is not particularly useful as a model. This is addressed in stochastic models by treating the
observables exhibiting stochastic behavior as probability distributions. In other words, the
observable of interest is converted from a point value on the real number line to a function.
This restores the commutativity of the diagram and makes the model deterministic over this
adjusted set of observables. For example, if a weather model predicts temperature, one would
want the model to generate the same distribution of temperatures as is observed in the real
weather  system  of  interest.  Another  example  is  quantum  mechanics.  The  propagation  of  the 
wave  function  is  completely  deterministic.  It  is  the  specific  point  measurement  that  is 
probabilistic.  This  is  known  as  collapsing  the  wave  function. 
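
A small sketch may help show why moving from point values to distributions restores determinism. It assumes, purely for illustration, a two-state system whose transitions are described by a fixed transition matrix `P`; neither the matrix nor the function name `step` comes from the source.

```python
def step(dist, P):
    """Propagate a probability vector one step through transition matrix P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

# Hypothetical two-state system: from state 0 the next state is a coin flip,
# while state 1 always stays in state 1.
P = [[0.5, 0.5],
     [0.0, 1.0]]

d0 = [1.0, 0.0]   # the system is known to start in state 0
d1 = step(d0, P)  # [0.5, 0.5]
d2 = step(d1, P)  # [0.25, 0.75]
```

An individual trajectory starting in state 0 is unpredictable, yet the distribution over states evolves by an ordinary function: the same input distribution always yields the same output distribution.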


6.3.4  What  are  the  relationships  among  models  of  different  perspectives? 

Most  models  of  real  world  systems  are  composites.  Why?  The  scientific  laws  we  work  with, 
whether  Newton's  Laws  or  the  law  of  one  price,  are  only  applicable  under  a  specific  set  of 
circumstances  or  assumptions.  For  example,  Newton's  law  of  gravity  determines  the  strength  of 
the  gravitational  force  between  two  point  masses.  What  happens  if  there  are  more  than  two 
point  masses?  The  presumption  is  that  we  can  reduce  the  system  to  pieces  where  the  law  or 
symmetry  relationship  applies,  then  put  the  pieces  back  together  again  to  obtain  the  behavior 
of  the  whole  system.  This  is  essentially  the  definition  of  reductionism. 

In  terms  of  our  setup,  this  means  we  break  the  observables  up  into  groups  and  work  with  the 
groups  separately.  In  the  two-body  example,  the  positions  of  the  two  bodies  are  completely 
unlinked  if  they  are  far  enough  apart  that  gravitational  attraction  is  negligible.  Thus,  the 
trajectories  can  be  generated  separately  while  still  obtaining  the  correct  position  of  each  body. 
The  model  is  technically  a  composite,  but  the  composition  is  fairly  straightforward. 

Of  course,  this  is  not  generally  the  case,  which  is  why  most  modeling  is  a  little  more 
complicated  than  this.  As  explained  in  the  previous  section,  modeling  a  subset  of  the 
observables  implies  that  the  omitted  observables  are  constant.  If  these  omitted  observables  are 
unlinked  with  those  retained  in  the  model,  then  it  is  not  a  problem.  However,  if  the  omitted 
observables  are  not  totally  unlinked,  the  composition  of  the  partial  models  yields  a  state  space 
that  does  not  completely  correspond  with  the  state  space  of  the  real  system.  Mathematically, 
the state space of the composed model will be larger than that of the real system. For example,
for two sets of observables F and G, the state space of the composed model is $S/R_F \times S/R_G$, of
which $S/R_{FG}$ is only a subset (Rosen 1978). We have no way to know which states of this
enlarged space are real and which are artifacts of the composite model.

For the two-body example, there are two additional cases of interest: (1) the bodies are close
enough that gravitational attraction matters, and (2) the bodies are colliding. Both involve
linkage  relationships,  but,  as  a  practical  matter,  each  is  handled  differently.  In  the  first  case,  we 
can  compute  the  instantaneous  acceleration  due  to  gravity  and  propagate  the  system  over 
small  time  steps.  In  the  second,  we  must  find  a  simultaneous  solution  for  multiple  symmetry 
relationships  including  conservation  of  momentum  and  conservation  of  energy.  Rosen  makes 
no  distinction  among  such  cases,  as  his  interest  was  exploring  relationships  between  physics 
and  biology.  However,  for  engineering  modeling,  the  distinction  matters.  Consequently,  the 
next  section  will  explore  the  differences  in  more  depth. 


6.4  Analysis  of  Model  Composition 

With  the  basic  mathematical  machinery  in  place,  we  now  consider  the  implications  of 
composing  multiple  models,  each  based  on  a  different  abstraction.  More  specifically,  if  a 
composition  is  successful,  what  does  it  imply  about  the  system  itself  and  the  models  that 
describe  it?  We  will  start  by  extending  the  definition  of  a  valid  model  from  the  previous  section 
to  accommodate  a  composite  model  and  identify  the  implied  conditions.  We  then  consider  the 
impacts  of  violating  those  conditions  on  achieving  a  successful  composite  model. 

Since  we  are  considering  composing  multiple  models  based  on  different  abstractions,  we  need 
to define each model. First, partition the set of observables F into n subsets $G_i$. Applying any
one set of observables to the system yields the abstraction $S/R_{G_i}$. To capture the dynamics
under this subset of observables, we need to project the dynamics $T_F$ into the subspace $S/R_{G_i}$.
We will call this projection $T_{G_i}$. This results in the reduced description of the system
($S/R_{G_i}$, $T_{G_i}$). Consistent with the previous section, a model of the reduced description is
($X_i$, $D_i$).

Imposing the condition that the diagram in Equation 2 must commute for (X, $D_X$) to be a model
of the system ($S/R_F$, $T_F$), model composition can be viewed as the situation
where ($S/R_F$, $T_F$) has been projected into multiple subspaces ($S/R_{G_i}$, $T_{G_i}$), each of which is
modeled individually as ($X_i$, $D_i$), and those models are then composed to yield (X, $D_X$) while preserving
the commutativity of the diagram. For two models, Equation 3 must commute.

Note that the diagram in Equation 3 is a modification of Equation 2. The difference is that there are two
models operating in parallel. One can take the set of observables, F, project it into two different
views of the system using the subsets of observables $G_1$ and $G_2$ via the natural projections $\pi_1$
and $\pi_2$, model and propagate each view separately, and still obtain the same result as measuring
the state of the system again at the later time. For the remainder of this paper, we will limit the
commutative diagrams to two abstractions, but it should be obvious how they could be
extended to include models of more than two abstractions.
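
Read as equations rather than as a diagram, the condition described above can be sketched as follows. The encodings $\alpha_i$ from each projected view $S/R_{G_i}$ into the corresponding model state space $X_i$ are notation assumed here by analogy with the single encoding $\alpha$ of Equation 2; this is one reading of the verbal description, not a reproduction of the original figure.

```latex
\[
\alpha_i \circ \pi_i \circ T_F \;=\; D_i \circ \alpha_i \circ \pi_i ,
\qquad i = 1, 2
\]
% \pi_i    : S/R_F     -> S/R_{G_i}   natural projection onto view i
% \alpha_i : S/R_{G_i} -> X_i         encoding of view i (assumed notation)
% D_i      : X_i       -> X_i         state transition of model i
```

The left-hand side advances the system and then projects and encodes the result; the right-hand side projects and encodes first and then advances each model on its own. Under this reading, each $D_i$ receives only the encoded state of its own view, which is what makes the two models independent.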

The first observation that we will make is that for this diagram to commute, the abstractions
$S/R_{G_1}$ and $S/R_{G_2}$ must be unlinked. If there are any linkage relationships among the
observables, then there are restrictions on the allowable states or state transitions that are not
captured in one or both models. Consequently, the combination of the models ($X_1$, $D_1$) and
($X_2$, $D_2$) could achieve a combination of states not allowable in $S/R_F$.

To  make  this  more  concrete,  consider  the  simple  two-body  example.  For  the  case  where  the 
two  bodies  are  widely  separated  and  gravity  is  negligible,  the  dynamics  of  each  body  can  be 
modeled  independently,  because  the  position  of  one  has  no  impact  on  the  position  of  the 
other. They are unlinked and the models would satisfy Equation 3. However, if the bodies
are close enough that gravity is a factor, then these independent models are no longer valid.
Gravity creates a linkage relationship among the otherwise independent projections. Running
the models independently would result in a combination of states that is not achievable in the
real world. Equation 3 would not commute.

Is  there  a  way  to  accommodate  this  linkage  relationship?  Consider  how  one  might  model  the 
two-body problem with gravity. As $\Delta t \to 0$, the state transition is governed by the
instantaneous acceleration due to gravity,

\[
\frac{d^2 x_i}{dt^2} = \frac{G m_j}{(x_i - x_j)^2}
\]

Thus, the state transition for body i is determined by a combination of state information from
both bodies: $(x_i, x_j, m_j, v_i) \to (x_i, v_i)$. The mass and position of the other body are effectively
parameters in the model. As a result, the trajectories of the two bodies can be modeled using
two  different  models  as  long  as  state  information  is  exchanged  between  the  two  at  short  time 
intervals. 
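
The exchange of state information at short time intervals can be sketched numerically. The code below is a minimal illustration under stated assumptions: the integration scheme (semi-implicit Euler), the function names, and the numerical values are choices made here, and the bodies are assumed never to occupy the same position.

```python
G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

def accel(x_self, x_other, m_other):
    """1-D gravitational acceleration of one body, directed toward the other."""
    r = x_other - x_self
    direction = 1 if r > 0 else -1
    return direction * G * m_other / r**2

def step_body(x, v, a, dt):
    """Advance one body's (position, velocity) by dt with semi-implicit Euler."""
    v_new = v + a * dt
    return x + v_new * dt, v_new

def simulate(b1, b2, dt, steps):
    """b1 and b2 are (x, v, m) tuples; each model reads the other's state every step."""
    (x1, v1, m1), (x2, v2, m2) = b1, b2
    for _ in range(steps):
        a1 = accel(x1, x2, m2)  # model 1 receives body 2's position and mass
        a2 = accel(x2, x1, m1)  # model 2 receives body 1's position and mass
        x1, v1 = step_body(x1, v1, a1, dt)  # each model then steps on its own
        x2, v2 = step_body(x2, v2, a2, dt)
    return (x1, v1), (x2, v2)

# Hypothetical values: two equal masses, initially at rest and 10 m apart.
(x1, v1), (x2, v2) = simulate((0.0, 0.0, 1e10), (10.0, 0.0, 1e10), dt=1.0, steps=10)
```

Each call to `step_body` touches only one body's state, so the two transition computations stay separate; what couples them is the exchange of positions and masses before every step.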

Equation 3 will not commute because $D_1$ and $D_2$ will not be functions. Since any given $x \in X_i$
does not uniquely determine the subsequent state, $D_i(x)$ may map to more than one future
state. In essence, the state information from the other model serves as parameters for $D_i$. So
while one cannot run truly independent models, the state transitions can still be computed
independently as long as state information is coordinated. More formally, Equation 4 must
commute.


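\[
\begin{array}{ccccc}
S/R_F & \xrightarrow{\;\alpha\;} & (X_1, X_2) & \xrightarrow{\;D_i\;} & X_i \\
\Big\downarrow {\scriptstyle T_F} & & & & \Big\| \\
S/R_F & \xrightarrow{\;\pi_i\;} & S/R_{G_i} & \xrightarrow{\;\alpha_i\;} & X_i
\end{array}
\qquad i = 1, 2
\]

Equation 4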


Note  that  the  major  difference  between  Equation  4  and  Equation  3  is  that  the  parallel  paths  for 
encoding  the  model  state  information  have  been  collapsed  into  a  single  path.  Since  the  states 
of one abstraction serve as parameters for the state transition for the other, it is no longer valid
to project the system into separate subspaces and then map to the model states, as the necessary
parameter values would be lost. However, that information is not required when checking the
correspondence of the models with the true system after the state transition. That is why the
bottom of the diagram remains the same, and we are able to maintain separate $D_i$'s. That
allows  one  to  have  separate  dynamic  models  for  each  subset  of  observables.  In  this  case,  we 
will  call  the  models  state  linked  because  they  must  exchange  state  information. 

However, we should note that Equation 4 implies that all combinations of states from $S/R_{G_1}$
and $S/R_{G_2}$ are still allowable. For example, setting the position for one body does not
intrinsically  restrict  the  set  of  possible  positions  where  we  can  place  the  second  body.  If  there 
are  linkage  relationships  among  the  observables  of  the  two  abstractions  that  limit  the  allowable 
combinations  of  states,  then  Equation  4  will  not  commute. 

Assume  that  the  two  bodies  are  colliding.  This  means  that  conservation  of  momentum  and 
energy apply. For instance, $m_1 v_1 + m_2 v_2 = C$ must hold both before and after the collision.
This  means  that  post-collision  velocities  cannot  be  determined  independently.  For  a  perfectly 
elastic  collision,  one  would  need  to  find  a  solution  that  simultaneously  satisfies  the  equations 
both  for  the  conservation  of  linear  momentum  and  the  conservation  of  kinetic  energy.  Certain 
combinations  of  velocities  are  not  allowable. 
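
For the one-dimensional, perfectly elastic case, the simultaneous solution has a standard closed form, which the sketch below uses. The function names and numerical values are assumptions for illustration; the check function simply tests both conservation conditions for any proposed pair of post-collision velocities.

```python
def elastic_collision(m1, v1, m2, v2):
    """Post-collision velocities for a 1-D perfectly elastic collision."""
    total = m1 + m2
    u1 = ((m1 - m2) * v1 + 2 * m2 * v2) / total
    u2 = ((m2 - m1) * v2 + 2 * m1 * v1) / total
    return u1, u2

def conserved(m1, v1, m2, v2, u1, u2, tol=1e-9):
    """Check that momentum and kinetic energy match before and after."""
    p_ok = abs((m1 * v1 + m2 * v2) - (m1 * u1 + m2 * u2)) <= tol
    # The common factor of 1/2 in kinetic energy is dropped from both sides.
    e_ok = abs((m1 * v1**2 + m2 * v2**2) - (m1 * u1**2 + m2 * u2**2)) <= tol
    return p_ok and e_ok

# Hypothetical collision: a 2 kg body at 3 m/s strikes a 1 kg body at rest.
u1, u2 = elastic_collision(2.0, 3.0, 1.0, 0.0)             # (1.0, 4.0)
joint_ok = conserved(2.0, 3.0, 1.0, 0.0, u1, u2)            # True
independent_ok = conserved(2.0, 3.0, 1.0, 0.0, u1, 0.0)     # False
```

The last line updates the first body's velocity while leaving the second body unchanged, which is what would happen if the two velocities were determined independently; the resulting pair violates both conservation conditions.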

In  such  a  case,  the  parallel  paths  of  Equation  4  must  also  be  collapsed  into  a  single  path 
because  certain  combinations  of  observables  are  not  allowable.  Thus,  projecting  the  system 
into  two  independent  subspaces  would  allow  infeasible  combinations  of  states.  This,  in  turn, 
collapses the two transition mappings $D_1$ and $D_2$ into a single mapping because some
combinations of elements of $D_1$ and $D_2$ are forbidden. At this point, independence between the
models  is  lost,  and  there  is  really  only  one  model.  This  is  evident  in  Equation  5.  In  this  case,  we 
will  call  the  models  transition  linked  because  they  must  coordinate  state  transitions. 

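\[
\begin{array}{ccc}
S/R_F & \xrightarrow{\;\alpha\;} & (X_1, X_2) \\
\Big\downarrow {\scriptstyle T_F} & & \Big\downarrow {\scriptstyle D} \\
S/R_F & \xrightarrow{\;\alpha\;} & (X_1, X_2)
\end{array}
\]

Equation 5

This analysis leads us to define three types of linkage relationship for the purpose of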
engineering  model  composition: 

•  Unlinked - There is no relationship among the subsets of observables of the various
subsystems 

•  State  linked  -  Any  combination  of  states  among  the  subsystems  is  allowable,  but  that 
combination  affects  the  state  transition  behavior  of  each  subsystem.  Consequently, 
each  model  must  know  something  about  the  states  of  the  others. 

•  Transition  linked  -  Not  all  combinations  of  states  among  the  subsystems  are  allowable. 
Consequently,  the  transition  behavior  of  all  states  must  be  determined  simultaneously. 


It  should  be  noted  that  these  diagrams  are  not  intended  to  be  representative  of  how  one  would 
actually  build  the  model.  Rather  they  express  the  mathematical  conditions  that  must  be  met  if 
one  wanted  to  build  a  composite  model.  While  developing  composite  models  that  meet  these 
requirements  may  seem  obvious  for  the  simple  two-body  example,  it  is  not  so  obvious  when 
considering  multiple  engineering  models  capturing  different  abstractions  of  the  same  entity,  for 
example,  aerodynamic  and  thermodynamic  models  of  the  same  aircraft. 

From  an  engineering  standpoint,  the  interesting  case  is  treating  a  set  of  transition  linked 
models  as  state  linked.  This  can  occur  when  one  attempts  to  compose  two  models  by 
coordinating  data  exchange  and  synchronizing  execution  without  realizing  that  there  is  a  latent 
transition linkage. As shown above, this would allow the models to achieve impossible states.
Returning to the two-body example, this would be equivalent to not checking the conservation
of momentum condition after the collision.


6.5  Insights  from  Multiscale  Physics  Modeling 

It  has  long  been  recognized  in  physics  that  systems  will  exhibit  qualitative  differences  in 
behavior  at  different  spatial  and  temporal  scales  (See  Section  5.1).  As  a  result,  different  sets  of 
observables  (i.e.,  abstractions)  are  applicable  at  different  scales.  Thus,  one  may  model  a  solid 
object  as  either  a  continuum  or  a  discrete  set  of  particles  depending  on  the  circumstances  and 
question  of  interest.  As  long  as  a  given  question  can  be  answered  with  a  single  abstraction,  we 
do  not  have  to  worry  about  model  composition.  However,  there  are  many  questions  that  arise 
in  engineering  and  physics  that  cannot  be  addressed  with  a  single  abstraction  either  because  of 
issues  of  computational  tractability  or  because  no  one  abstraction  can  capture  the  phenomena 
of  interest.  Addressing  such  situations  is  the  domain  of  multiscale  physics  modeling. 

Hoekstra,  et  al.  (2014)  provide  a  recent  overview  of  the  state  of  the  field.  A  central  aspect  of 
multiscale  modeling  is  what  they  call  scale  bridging.  Winsberg  (2010)  considers  this  problem  in 
depth. He highlights two approaches: serial multiscale and parallel multiscale. Serial multiscale
is the most common and describes the case where we run a model at one scale first, and then
use  it  to  parametrize  a  model  at  another  scale.  Parallel  multiscale  modeling  is  the  case  where 
the  abstractions  at  different  scales  interact  and  consequently,  the  models  cannot  be  run 
sequentially.  They  must  be  run  in  parallel.  We  will  argue  that  these  two  approaches  correspond 
with  the  state  linked  case  and  the  transition  linked  case  respectively. 

Yang  and  Marquardt  (2009)  present  a  set  theory  based  characterization  of  multiscale  modeling 
and attempt to capture both cases. However, their formulation implicitly relies on the
reductionist hypothesis, that is, that the system can be decomposed into a hierarchy of
subcomponents. While this may be an acceptable assumption under some circumstances, it is
questionable  in  the  general  case.  As  argued  by  Pennock  and  Gaffney  (2016),  regardless  of  the 
veracity  of  the  reductionist  hypothesis,  as  a  practical  matter  we  must  contend  with  multiple 
overlapping  and  incompatible  ontologies  when  we  consider  a  system  from  multiple  views. 

To  this  point,  Winsberg  considers  the  real  world  case  of  researchers  attempting  to  build  a 
physics-based multiscale model of nano-crack propagation in silicon [14]. In short, to model the
phenomenon,  one  must  simultaneously  consider  linear-elastic  theory,  molecular  dynamics,  and 
quantum  mechanics.  The  problem  is  that  these  three  theories  are  inconsistent  and 
incompatible.  To  make  the  simulation  work,  "handshaking  algorithms"  that  require  deliberate 
fictions  must  be  introduced  to  translate  parameter  values  back  and  forth  among  the  three 
views.  For  instance,  fictitious  "silogen"  atoms  are  introduced  on  the  boundary  between  the 
molecular  dynamics  view  and  the  quantum  mechanical  view.  There  is  no  such  thing  as  a  silogen 
atom,  but  it  serves  the  purpose  of  passing  state  information  between  the  incompatible  views  in 
a  manner  that  makes  the  state  transitions  for  both  views  feasible.  However,  Winsberg  also 
notes  that  these  linkage  relations  have  an  empirical  aspect.  This  would  seem  to  be  consistent 
with  observations  that  scale  bridging  approaches  tend  to  be  domain  and/or  application  specific 
(Hoekstra  et  al  2014,  Chopard  et  al  2014). 

These  observations  are  also  consistent  with  our  discussion  of  Equation  5  where  there  are 
transition  linkages  among  the  abstractions.  In  the  example  above,  researchers  are  modeling  the 
exact  same  block  of  material  as  a  continuum,  molecules,  and  quantum  particles  simultaneously. 
However,  when  one  creates  three  independent  models,  latent  linkages  among  these  views  are 
lost.  Consequently,  the  composite  model  can  achieve  states  that  are  not  achievable  in  the  real 
system.  The  state  restrictions  must  be  built  back  in  somehow.  That  is  the  role  that  these 
"fictions"  play.  However,  since  they  are  not  always  derived  from  theory,  they  must  be 
developed  via  trial  and  error  and  will  likely  be  application  specific. 

Let  us  now  consider  how  these  workarounds  from  multiscale  physics  fit  into  our  mathematical 
formulation  of  multi-modeling. 


[14] The original work is documented in Abraham et al (1998).


6.5.1 Workarounds for State Linkages


First, consider the case of serial multiscale modeling. This is effectively a variation of the
two-way state linkage case described by Equation 4. If one can assume that this linkage is one-way,
that is, the state propagation of one abstraction depends on the state of the other but not the
other way around, then the diagram shown in Equation 6 commutes.


Equation  6 


At first glance, it might seem that there is no gain from decomposition since the state space for
$S/R_{G_1}$ is encoded in the model for $S/R_{G_2}$. However, meeting this requirement allows one to
decouple the dynamic propagation of $S/R_{G_1}$ from $S/R_{G_2}$ entirely. Thus, it is feasible to
compute the state space trajectories for $S/R_{G_1}$ first, then compute the state space trajectories
for $S/R_{G_2}$, using the precomputed trajectories of $S/R_{G_1}$ as an input. A simple, non-physics
example  of  this  case  is  modeling  the  accumulation  of  interest  in  an  individual's  bank  account. 
The  growth  in  the  balance  is  dependent  on  the  interest  rate,  but  the  interest  rate  does  not 
depend  on  the  current  bank  balance.  Thus,  one  can  create  a  model  to  forecast  future  interest 
rates  and  then  feed  the  results  into  the  bank  account  model. 
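
The bank account example can be sketched in a few lines. This is a minimal illustration of a one-way state linkage, not a model used in this research: the function names, the linear drift in the rate forecast, and all numeric values are assumptions chosen only to make the serial structure visible.

```python
from typing import List


def forecast_rates(initial_rate: float, drift: float, periods: int) -> List[float]:
    """Upstream model: propagate the interest rate with no reference to any balance."""
    rates = []
    rate = initial_rate
    for _ in range(periods):
        rates.append(rate)
        rate = max(0.0, rate + drift)  # hypothetical linear drift, floored at zero
    return rates


def propagate_balance(initial_balance: float, rates: List[float]) -> List[float]:
    """Downstream model: consume the precomputed rate trajectory as an input."""
    balances = [initial_balance]
    for rate in rates:
        balances.append(balances[-1] * (1.0 + rate))
    return balances


# Run the upstream model to completion first, then feed its output downstream.
rates = forecast_rates(initial_rate=0.02, drift=0.005, periods=3)
balances = propagate_balance(1000.0, rates)
```

Because the rate model never reads the balance, the two models can be executed sequentially. If the rate did depend on the balance, the linkage would be two-way and the models would have to exchange state at every step.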


6.5.2  Workarounds  for  Transition  Linkages 

While  it  was  shown  in  previous  sections  that,  in  the  most  general  case,  abstractions  that  are 
transition  linked  require  an  integrated  model,  the  work  in  physics-based  parallel  multiscale 
modeling  suggests  that  there  might  be  special  cases  where  one  can  work  around  this  limitation. 
The first case is when there is a refinement relationship between $S/R_{G_1}$ and $S/R_{G_2}$.
If $S/R_{G_1}$ refines $S/R_{G_2}$, then each equivalence class of $S/R_{G_1}$ intersects exactly one equivalence
class of $S/R_{G_2}$, but any given equivalence class of $S/R_{G_2}$ may intersect more than one class of
$S/R_{G_1}$. This is an aggregation relationship between $S/R_{G_1}$ and $S/R_{G_2}$, which is equivalent to a
one-way transition linkage. Consequently, the two abstractions are compatible, and the linkage
relationship is known, but $S/R_{G_1}$ allows one to resolve more system states than $S/R_{G_2}$. Thus,
one could view it as a higher resolution model. Under these circumstances, one can run the
model $(X_1, D_1)$ first and then use it to parametrize $(X_2, D_2)$, which is run second. This illustrates
the case of multi-fidelity modeling where one conducts a limited number of runs of the high fidelity
model to calibrate a lower fidelity model that is used to explore a larger space. This situation is
analogous to that presented by Yang and Marquardt (2009).
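
For a finite state set, the refinement condition can be checked directly. The following is a sketch under stated assumptions: the state set, the three observables, and the helper names `quotient` and `refines` are all hypothetical and serve only to illustrate the definition given above.

```python
from collections import defaultdict
from typing import Callable, FrozenSet, Hashable, Iterable, Set


def quotient(states: Iterable[Hashable], observable: Callable) -> Set[FrozenSet]:
    """Group states into equivalence classes: two states are equivalent
    when the observable returns the same value for both."""
    classes = defaultdict(set)
    for s in states:
        classes[observable(s)].add(s)
    return {frozenset(c) for c in classes.values()}


def refines(fine: Set[FrozenSet], coarse: Set[FrozenSet]) -> bool:
    """True when every class of `fine` lies entirely inside a class of `coarse`."""
    return all(any(f <= c for c in coarse) for f in fine)


S = range(12)                        # hypothetical finite state set
q1 = quotient(S, lambda s: s % 6)    # hypothetical finer abstraction
q2 = quotient(S, lambda s: s % 3)    # hypothetical coarser abstraction
q3 = quotient(S, lambda s: s % 4)    # abstraction unrelated to q2 by refinement

q1_refines_q2 = refines(q1, q2)
q3_refines_q2 = refines(q3, q2)
q2_refines_q3 = refines(q2, q3)
```

Here `q1` refines `q2`, so the serial workaround applies. Neither of `q2` and `q3` refines the other, which corresponds to the second case discussed next.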

The second case is where there is no refinement relationship between the abstractions as was
the case with the nano-crack propagation model. To convert the transition linkage to a state
linkage, one can partition S into multiple subsystems. For two parallel abstractions, create three
subsystems, $S_1$, $S_2$, and $S_3$, by employing the subset of observables $G_3$. A spatial division is a
good example but not strictly required. The idea is to apply abstraction $G_1$ to $S_1$ and abstraction
$G_2$ to $S_2$. Because the abstractions $G_1$ and $G_2$ are applied to non-overlapping subsystems, there
is no longer an implicit transition linkage.

This is depicted notionally in Figure 10. Here, $G_3$ is a single, real-valued function. A threshold k
converts S into three subsystems: $S_1$ ($G_3 < k$), $S_2$ ($G_3 > k$), and $S_3$ ($G_3 = k$). However, this
creates two issues. First, since no abstraction is applied to $S_3$, the state information about this
portion of the system is lost. Second, the state transition behavior of $S_1$ is affected by the state
of $S_2$ and vice versa, but the state transition for $S/R_{G_1}$ is incompatible with $S/R_{G_2}$ and vice
versa. The first issue is addressed by making $S_3$ as small as possible. The second issue is
addressed by introducing Winsberg's fictions. In essence, $S_3$ is represented by an artificial
abstraction.


[Figure 10: the system S is divided along the observable $G_3$ at threshold k into $S_1$ ($G_3 < k$, abstraction $G_1$), $S_3$ ($G_3 = k$), and $S_2$ ($G_3 > k$, abstraction $G_2$).]

Figure 10 - Notional partition of the system $S/R_F$ into non-overlapping abstractions $S_1/R_{G_1 G_3}$ and
$S_2/R_{G_2 G_3}$ using the observable $G_3$

In Winsberg's example, the fictitious "silogen" atoms on the boundary between the region
modeled using molecular dynamics and the region modeled using quantum mechanics serve as
the artificial abstraction. The net result of this approach is a more accurate model of the whole
system at the price of lost information about the overlap region. As long as the overlap region is
small, this can be an acceptable price to pay.
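
The threshold partition and the role of the artificial abstraction can be sketched as follows. This is an illustration only and is not the handshaking algorithm from the crack propagation work: the element labels, the observable values, and in particular the weighted blend used as the "fiction" are assumptions standing in for a relationship that would be calibrated by trial and error.

```python
from typing import Dict, List, Tuple


def partition_by_threshold(
    g3: Dict[str, float], k: float
) -> Tuple[List[str], List[str], List[str]]:
    """Split element labels into S1 (G3 < k), S2 (G3 > k), and S3 (G3 == k)."""
    s1 = [e for e, v in g3.items() if v < k]
    s2 = [e for e, v in g3.items() if v > k]
    s3 = [e for e, v in g3.items() if v == k]
    return s1, s2, s3


def boundary_fiction(s1_edge_state: float, s2_edge_state: float, weight: float = 0.5) -> float:
    """Hypothetical handshake: blend the edge states of the two abstractions
    into one artificial value for S3. The weight is a placeholder for a
    quantity that would have to be calibrated empirically."""
    return weight * s1_edge_state + (1.0 - weight) * s2_edge_state


# Hypothetical observable G3: position of each element along one axis.
positions = {"a": 0.0, "b": 1.0, "c": 2.0, "d": 3.0, "e": 4.0}
s1, s2, s3 = partition_by_threshold(positions, k=2.0)
s3_state = boundary_fiction(s1_edge_state=10.0, s2_edge_state=20.0)
```

Only element `c` falls in $S_3$, so little state information is lost, and each abstraction sees the boundary through the blended value rather than through the other abstraction's native state.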

Since this "workaround" converts transition linked sets of observables into state linked sets, the
resulting requirement is a modification of Equation 4. First, the original state space $S/R_F$ is
converted to the partitioned state space $S_1/R_{G_1 G_3} \times S_2/R_{G_2 G_3}$ via the mapping P. Second, the
"fictions" $X_3$ and $X_4$ are introduced to replace the missing state information for $S_3$ in a way that
is compatible with each abstraction. The result is Equation 7. If this diagram commutes, one can
apply a parallel multiscale model (or something analogous) to capture the behavior of the
system.


Equation 7


A few things to note: The fictions $X_3$ and $X_4$ are encoded via functions of the system state for
each abstraction. Since the fictions are not always defined by a meter, their state values may
not be the result of direct measurement. Instead, encoding functions must be determined
through trial and error. This is consistent with Winsberg's observations. The structure of the
fiction would be determined experimentally, and the state of the fiction at any one instant
would be determined by a combination of the states from each of the abstractions.


6.6 Implications for building Enterprise Models


Let  us  revisit  the  questions  posed  earlier  in  light  of  the  analysis  performed.  First,  we  can  define 
a  perspective  or  abstraction  of  a  system  as  a  quotient  set  determined  by  the  selected  collection 
of  observables.  Applying  the  quotient  set  definition  leads  to  a  precise  characterization  of  the 
linkages  among  the  multiple  perspectives  of  the  same  system.  Models  of  these  system 
perspectives  inherit  these  linkages  whether  recognized  or  not. 

This leads to the obvious conclusion that a successful composition of models from different
perspectives means that these linkage relationships are either absent or explicitly
accommodated, as failure to account for them allows the composed model to achieve
unallowable states. That, in and of itself, is not particularly interesting. Rather, it is the subsequent
characterization of models as either state linked or transition linked that is useful. It allows for a
precise  definition  of  what  it  means  to  be  conceptually  interoperable.  If  two  models  are 
conceptually  interoperable,  there  are  no  latent  transition  linkages  among  their  corresponding 
abstractions. 

The  justification  for  this  definition  is  as  follows.  Obviously  if  two  models  are  unlinked,  there  is 
no  interoperability  issue.  If  two  models  are  state  linked,  then  their  composition  is  valid  if  data 
exchange  and  state  transitions  are  synchronized.  This  means  that  satisfying  levels  1  through  5 
of  the  LCIM  (technical,  syntactic,  semantic,  pragmatic,  and  dynamic  interoperability)  is 
sufficient  to  achieve  interoperability  (Wang  et  al  2009).  No  additional  condition  is  required. 
Thus,  satisfaction  of  Level  6,  conceptual  interoperability,  is  implied.  However,  if  two  models  are 
transition  linked,  satisfaction  of  levels  1  through  5  is  not  sufficient.  Their  theories  are 
"inconsistent"  which  means  that  they  lack  conceptual  interoperability. 

Explaining a lack of conceptual interoperability as the presence of transition linkages among
models clarifies an assertion made by Wang et al. (2009) that the challenges that simulation
developers have experienced when applying HLA can be attributed to a lack of conceptual
interoperability among the federated simulations. If the simulation models are state linked,
then  a  framework  such  as  HLA  should  be  sufficient  as  it  coordinates  execution  and  data 
exchange.  However,  if  there  are  transition  linkages  among  the  models  then  data  exchange  and 
coordinated  execution  are  insufficient  to  achieve  a  valid  composition. 

There  are  several  observations  that  follow  directly  from  the  proposed  definition  of  conceptual 
interoperability: 

•  Transition  linkages  may  vary  over  different  subsets  of  the  system's  set  of  states.  Thus, 
two  models  may  be  conceptually  interoperable  under  some  circumstances  but  not 
others.  Conceptual  interoperability  is  not  an  absolute  attribute  of  a  pair  of  models. 

• Conceptual interoperability is equivalent to the case where each transition linkage is
contained within a single model.

• Transition linkages among models may be removed by either repartitioning the set of
observables, F, into new subsets or partitioning S into non-overlapping subsystems using
a subset of F as the basis for partition.


6.6.1 Sources of Transition Linkages

In  order  to  understand  how  to  mitigate  transition  linkages  among  models,  it  is  necessary  to 
consider  the  sources.  Four  sources  are  hypothesized: 

1.  Explicit:  The  transition  linkages  are  known  in  principle,  but  the  composite  model  is  large 
and  complicated.  Consequently,  they  are  difficult  to  find  and  accommodate. 

2. Domain Exceedance: The models in question are unlinked or state linked for the subsets
of S for which they were designed, but they are unknowingly applied to a subset of S for
which there is a transition linkage.

3.  Intentional  Duplication:  S  is  intentionally  modeled  using  two  different  transition  linked 
abstractions  because  none  of  the  available  abstractions  would  allow  Equation  2  to 
commute  for  the  phenomena  of  interest. 

4. Unintentional Duplication: A subsystem of S is unintentionally modeled using two
different  transition  linked  abstractions  as  a  consequence  of  independent  model 
development. 


Case 1 is the domain of constraint theory (Friedman & Leondes 1969a, b, c). The necessary
linkage  relations  are  present  in  the  models,  but  they  have  been  combined  in  such  a  way  that 
they  inappropriately  constrain  the  variable  space.  Constraint  theory  provides  a  means  to 
analyze  these  situations. 

Case  2  is  a  fairly  common  modeling  problem.  For  example,  in  the  two  body  model,  a  latent 
transition  linkage  would  occur  if  one  used  the  state  linked  gravity  models  but  never  checked  for 
a  collision  between  the  two  bodies. 

Case 3 is exhibited in the parallel multiscale physics example. Because none of the available
abstractions could accurately model the crack propagation, the researchers combined them.
Figuratively, they are modeling the same thing in different ways at the same time, but they have
no other choice.

Case  4  is  more  subtle.  When  modeling  any  system,  one  must  often  make  assumptions  about 
that  system's  context.  If  an  observable  of  the  context  is  ignored,  then  the  modeler  is  assuming 
that  the  observable  is  unlinked  or  constant.  If  the  linkage  is  recognized,  then  the  modeler  is 
explicitly or implicitly integrating the context into the system model.

For example, imagine two models that represent the dynamic behavior of two different
projectiles.  One  model  assumes  that  the  Earth  is  flat.  The  other  model  assumes  that  the  Earth  is 
a  sphere.  However,  neither  model  explicitly  models  the  Earth.  Rather  the  Earth  is  modeled 
implicitly  as  a  consequence  of  the  selected  equations  of  motion.  Thus,  the  problem  may  not  be 
immediately  obvious  upon  inspection  of  the  models.  Yet,  if  these  two  models  are  taken  "off  the 
shelf"  and  integrated  into  a  larger  model,  they  have  implicitly  modeled  the  Earth  twice.  There  is 
a  latent  transition  linkage  that  must  be  dealt  with. 


6.6.2  Possible  Mitigations  for  Existing  Transition  Linkages 

As noted above, removing transition linkages among models involves either repartitioning the
set of observables, F, into new subsets or defining subsystems of S using a subset of F as the
basis for partition. Depending on the circumstances, only one of the approaches may be viable.
When both choices are available, there are tradeoffs. Repartitioning F is tantamount to
redesigning the models to ensure that the transition linkages are contained within integrated
models. Partitioning S, on the other hand, requires the introduction of "handshake algorithms"
or "middleware." As noted by Winsberg, developing these may require trial and error,
particularly when there is no theoretical explanation of the relationship. When trial and error is
necessary, the resulting "handshakes" are effectively empirical. It is tantamount to
interpolation over the available data set. Thus, one must be concerned with the risk of model
induced error when predicting the consequences of a design decision outside of the training
data versus when a fully unified theory is employed. Real world modeling efforts may face
transition linkages from multiple sources; thus, it will likely be a case-by-case decision.

The  author  considers  the  first  two  cases  to  be  instances  of  common  problems  faced  when 
composing  engineering  models.  Thus,  the  proposed  solutions  are  "standard"  to  some  extent. 
This  is  not  to  suggest  that  they  are  easy  problems  to  address.  Rather,  there  is  already  much 
work  going  on  to  address  these.  Consequently,  the  proposed  mitigations  are  only  discussed 
briefly  for  completeness.  The  author's  hypothesis  is  that  the  second  two  cases  are  major 
challenges  to  MBSE  and  MCE  approaches.  The  hypothesized  approaches  for  addressing 
transition  linkages  among  models  for  each  source  are  summarized  in  Table  4. 


Table 4 - Hypothesized Approaches for Addressing Transition Linkages

| Linkage Source | Preferred Approach | Supporting Methods |
|---|---|---|
| Explicit | Partition F | Use domain ontologies combined with formal model checking procedures |
| Domain Exceedance | Partition F | Use documentation of domain constraints with formal model checking procedures |
| Intentional Duplication | Partition S | Partition S into non-overlapping subsystems and use empirically calibrated "middleware" to bridge the partitions |
| Unintentional Duplication | Partition S | Use domain ontologies and testing to identify potential linkages. Then partition S into non-overlapping subsystems |

As mentioned previously, Case 1 is addressable via constraint theory. Since the linkage
relationships are explicitly known, repartitioning F and developing integrated models may be
the preferred approach. Model documentation, formal domain ontologies, and formal model
checking procedures may assist modelers with this assessment.

Similarly,  the  preferred  solution  for  Case  2  is  to  repartition  F  when  the  necessary  linkage 
relationships  can  be  introduced.  To  facilitate  such  assessments,  metadata  describing  the 
conditions  under  which  the  model  is  valid  could  be  useful.  However,  there  are  limits  as  it  is 
effectively  impossible  for  model  developers  to  list  every  factor  that  they  did  not  consider.  This 
problem  is  exacerbated  when  certain  model  formulations  are  standard  for  a  domain  and  the 
model  developer  may  not  even  be  aware  of  its  limitations.  Again,  domain  ontologies  and  formal 
model  checking  procedures  may  be  helpful  in  identifying  these  linkages. 

For case 3, the only real option is to partition S as was done in the crack propagation example.
This involves choosing a set of observables to break S into non-overlapping subsystems. The
common basis for partition will likely be spatial for most engineering problems. Once the
partition is created, "middleware" or "handshake" algorithms can be developed to transfer state
information across the partition. This effectively converts the transition linkages into state
linkages.

For  case  4,  there  are  no  obvious  answers.  This  case  would  typically  arise  in  situations  where  one 
wants  to  reuse  existing  models  and  simulations,  but  they  are  black  boxes.  If  that  is  the  case, 
partitioning  S  may  be  the  only  viable  option.  Domain  ontologies  may  aid  in  identifying  typically 
assumed  objects  and  relations  for  a  given  application  area.  This  may  support  targeted  testing 
and  evaluation  of  a  candidate  model  to  infer  how  relevant  phenomena  were  implicitly 
modeled. For example, if a domain ontology or other documentation indicated that there is a
relationship between the projectile and the Earth, this could cue a modeler to evaluate a
candidate model to infer the assumed representation of the Earth: flat, spherical, oblate
spheroid, etc. If a duplicate representation is detected, it may be possible to handle it via
partitioning  of  S,  but  it  may  require  trial  and  error  to  develop  a  calibrated  "handshake"  among 
the  partitions. 
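
The targeted testing idea can be illustrated with a simple probe. The sketch below assumes, hypothetically, that each black-box projectile model can be queried for the gravitational acceleration it applies at a given altitude; real off-the-shelf models may expose no such interface, and altitude dependence of gravity is only one possible signature of the implicit Earth representation. The two stand-in models and the function `infer_earth_assumption` are illustrative assumptions.

```python
from typing import Callable

G0 = 9.80665            # standard gravity at the surface, m/s^2
R_EARTH = 6_371_000.0   # mean Earth radius, m


def flat_earth_accel(altitude_m: float) -> float:
    """Stand-in black box: constant gravity, as in a flat-Earth formulation."""
    return G0


def spherical_earth_accel(altitude_m: float) -> float:
    """Stand-in black box: inverse-square gravity over a spherical Earth."""
    return G0 * (R_EARTH / (R_EARTH + altitude_m)) ** 2


def infer_earth_assumption(
    model: Callable[[float], float],
    probe_altitude_m: float = 100_000.0,
    tol: float = 1e-6,
) -> str:
    """Probe the model at two altitudes and classify its implicit Earth."""
    low = model(0.0)
    high = model(probe_altitude_m)
    if abs(high - low) <= tol * abs(low):
        return "flat"     # gravity does not change with altitude
    return "curved"       # gravity falls off with altitude


assumption_a = infer_earth_assumption(flat_earth_accel)
assumption_b = infer_earth_assumption(spherical_earth_accel)
duplicate_representation = assumption_a != assumption_b
```

When the inferred assumptions differ, the two candidate models carry conflicting implicit representations of the Earth, which flags a possible case 4 linkage before the models are composed.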


6.7  Implications  for  multi-level  modeling 

Reflecting on the four sources of transition linkages and the associated mitigations, there are
several implications for building a composite model from multiple existing models or theories. If
the sources of transition linkages among candidate models are limited to cases 1 and 2, then
methods commonly suggested to facilitate model interoperability, including ontologies, model
metadata, and formal model checking procedures, may be effective. (This is analogous to MDO.)
However, two aspects of multi-level modeling risk triggering case 3 and case 4 sources of
transition linkages. The first is the necessity of employing multiscale ontologies. This runs
immediately into case 3, which means that "handshake" algorithms will be required, and these
may be empirical and case specific. The second is the desire to use off-the-shelf models in a "plug
and play" fashion. Because off-the-shelf models may have been designed for any number of
purposes, case 4 sources of transition linkages are likely. Again, addressing these may
require case specific "handshake" algorithms.

The  case  3  issues  are  fundamental,  and  only  new  scientific  theories  can  permanently  resolve 
them.  This  leaves  modelers  with  problem  or  domain  specific  solutions.  The  case  4  issues  may 
also  be  resolvable  in  a  problem  or  domain  specific  way,  but  this  defeats  the  intent  of  general 
"plug  and  play"  model  composition.  Note  that  all  of  the  approaches  to  mitigating  transition 
linkages  among  models  are  more  tractable  in  a  stable  problem  space.  This  would  be  consistent 
with  assertions  that  multi-scale  modeling  is  more  likely  to  be  successful  when  domain  focused. 

This  is  to  be  expected  because  an  unmodeled  transition  linkage  is  essentially  information  about 
the system that is lost as a part of the reduction process. To account for the linkage, that
information  has  to  be  put  back  in  the  model.  Thus,  experience  acquired  through  trial  and  error 
serves  as  a  basis  for  restoring  the  missing  information.  However,  this  is  essentially  an  exercise 
in  interpolation.  Thus,  applying  the  composed  model  outside  of  the  experience  base  incurs 
substantial  risk  of  model  induced  error. 

Still,  even  in  domain  focused  situations,  there  are  likely  approaches  to  model  development  and 
model  selection  that  would  reduce  the  risk  of  unmanaged  case  4  linkages  going  forward.  The 
analysis  presented  here  leads  to  several  research  questions  toward  that  end: 

•  Are  there  indicators  that  could  be  used  to  identify  which  analysis  efforts  would  be  at  risk 
of  incurring  case  3  and  4  linkages  before  attempting  to  build  a  composed  model? 

•  Are  existing  methods  of  designing  for  model  reusability  effective  at  minimizing  case  4 
linkages? 

•  What  is  the  appropriate  level  of  abstraction  to  target  interoperability  standards  and  tool 
development  to  minimize  the  risk  of  case  4  linkages? 

o  Is  the  appropriate  level  of  abstraction  domain  specific? 

o  Would  Doyle  and  Csete's  (2011)  advocated  "bowtie"  architectural  approach  to 
reuse  help  reduce  case  4  linkages? 

• Are there certain levels of abstraction that are less prone to case 4 linkages? Does this
explain why certain software tools seem to be extremely reusable while others are not?


6.8 Extensions to Category Theory


One  of  the  challenges  of  the  problem  formulation  in  the  previous  section  is  that  it  is  difficult  to 
apply  the  principles  to  more  than  two  models  in  a  practical  sense.  For  example,  if  one  wanted 
to  compose  four  models  instead  of  two,  one  would  need  to  check  for  transition  linkages  for 
each  pair  of  models,  resulting  in  six  comparisons.  As  the  number  of  models  increases,  the 
number of required comparisons increases rapidly. For instance, 5 models would require 10
comparisons  and  10  models  would  require  45  comparisons.  This  does  not  even  account  for 
making  the  models  mutually  compatible  if  transition  linkages  are  found.  This  is  a  serious 
impediment  to  implementing  a  practical  model  composition  and  switching  approach  to  support 
enterprise  modeling. 
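
The comparison counts quoted above follow from the number of unordered pairs among n models, n(n-1)/2. A short sketch, with hypothetical model names, makes the growth concrete:

```python
from itertools import combinations
from math import comb


def pairwise_checks(models):
    """Enumerate every unordered pair of models that would need a
    transition-linkage check."""
    return list(combinations(models, 2))


checks = pairwise_checks(["M1", "M2", "M3", "M4"])   # hypothetical model names
counts = {n: comb(n, 2) for n in (4, 5, 10)}          # n(n-1)/2 for each n
```

The count grows quadratically with the number of models, and this tally covers only detection of linkages, not the effort of resolving any that are found.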

One  promising  avenue  to  address  the  problem  is  the  application  of  a  branch  of  mathematics 
called  category  theory.  While  category  theory  will  not  tell  you  how  to  eliminate  the  transition 
linkages  among  models,  its  capability  to  support  abstraction  could  provide  the  ground  rules  for 
proper  model  composition  and  a  way  to  reduce  the  number  of  comparisons  required  as  each 
new  model  is  added  to  the  enterprise  analysis  inventory. 

One  way  this  might  work  is  demonstrated  by  Wisnesky  et  al  (2017).  They  apply  a  category 
theory  based  query  language  they  call  FQL  to  show  how  heterogeneous  databases  could  be 
integrated  without  performing  an  exhaustive  number  of  comparisons.  In  a  sense,  once  two 
databases  are  combined  they  become  a  new  database.  So  when  another  heterogeneous 
database  is  introduced,  it  only  needs  to  be  compared  to  the  new  database  not  the  original  two. 
This  problem  is  analogous  to  the  model  composition  problem.  Of  course  there  are  some 
caveats  and  technical  issues  here,  but  it  is  still  a  promising  direction  of  future  research. 
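
The counting argument behind this idea can be sketched as a fold. This toy example does not use FQL or any category theoretic machinery; it assumes, as the analogy requires, that a composite can stand in for its constituents in later comparisons, and it ignores the cost of each individual comparison. The function name and model labels are hypothetical.

```python
from typing import List, Tuple


def incremental_composition(models: List[str]) -> Tuple[str, List[Tuple[str, str]]]:
    """Fold models into one composite, recording each comparison made."""
    comparisons: List[Tuple[str, str]] = []
    composite = models[0]
    for new_model in models[1:]:
        # Compare the newcomer against the current composite only,
        # not against each of the original constituents.
        comparisons.append((composite, new_model))
        composite = f"({composite}+{new_model})"
    return composite, comparisons


composite, comparisons = incremental_composition(["A", "B", "C", "D"])
```

Under these assumptions, four models require three comparisons rather than six, and in general n models require n-1. Whether model composition admits such a well-behaved composite is exactly the open question noted above.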

One  additional  feature  of  category  theory  is  that  it  may  serve  as  a  convenient  language  to 
describe  and  manage  heterogeneous  models.  This  has  been  recognized  by  both  Rosen  (1978) 
and Baez and Stay (2010). The reason is that it naturally incorporates the idea of abstraction
and  focuses  on  the  relations  among  abstractions  rather  than  the  internals  of  the  abstraction. 
Consequently,  one  can  naturally  build  networks  of  categories  that  are  created  by  adding  and 
removing  assumptions  (axioms)  from  categories.  Adding  and  removing  structure  from  models  is 
analogous. Again, category theory will not do the work for the modeler, but it may provide a
powerful language to describe problems, establish necessary conditions, and organize models.
To put it another way, category theory, by itself, provides no information about the real world.
However, it may provide guidance as to how to organize information about the real world in an
intelligent way. While much research is still needed to determine whether or not category theory
will be useful in a practical sense, it did provide the inspiration for the approaches proposed in
Section  7. 

7  Approach  for  Identifying  and  Counter-Intuitive  Policy  Impacts 


Let us recap the analysis to this point. The identification of unintended policy consequences in
an enterprise system will likely require the systematic exploration of alternative model structures
that likely use overlapping and possibly inconsistent abstractions (ontologies) that may exist at
different scales. The previous section identified transition linkages among these abstractions as
an inhibitor to composing the associated models. One source of these transition linkages is
overlapping representations, which is almost by definition the motivation behind multi-level
modeling. Thus, a literal implementation of multi-level models where we swap different models
in and out for each layer is infeasible. This observation was also made during RT-138
(Pennock et al 2016), but now there is a mathematical explanation for this phenomenon.

Reflecting on the literature review, we now see that the mathematical analysis presented in
Section 6 also provides a mechanism to describe, at least at a high level, the differences in
approach between the physical sciences and the social sciences. Essentially, when faced with a
situation where no one available abstraction can explain a phenomenon, the physical sciences
partition S and the social sciences refactor F. This also provides some insight into why
ontologies tend to proliferate in the social sciences. Consequently, any systematic approach to
varying enterprise model structure must explicitly account for both possible approaches to
removing transition linkages.

In  this  section,  we  first  describe  the  implications  of  the  two  approaches  to  removing  transition 
linkages  and  the  resulting  implications  for  how  enterprise  models  should  be  built,  analyzed,  and 
used.  Once  that  is  established,  we  consider  the  implications  for  model  validation.  Finally,  we 
present  a  tentative  approach  to  systematically  navigating  the  space  of  possible  models. 


7.1  A  Systematic  Approach  to  Enterprise  Model  Development 


7.1.1  Limitations  of  Existing  Approaches 

Before developing a systematic approach to enterprise model development, it is necessary to
consider how models are built across multiple overlapping abstractions today. Based on the
analysis of the literature (Section 5), we contend that there are really two basic approaches,
though actual model implementations may mix the two. First, we will consider the typical
multi-scale modeling approach from the physical sciences. A notional illustration of this process
is provided in Figure 11.

Figure 11 - Approach 1: a) Initial conceptualization of the system will likely involve mixtures of many factors and
relationships from different abstractions. b) Certain abstractions, often those defined by scientific theories, are
very well understood and predictable in isolation. A natural organizational scheme is to sort the factors from the
initial conceptual model into abstractions defined by theories. When these abstractions overlap, it is natural to
organize them into levels. This is often done by spatial scale, but that is not strictly required. c) Since the
conceptual model is now organized by abstraction, there is often a natural mapping of each level to a canonical
mathematical or computational model. However, this creates an issue. There may be no obvious or even
theoretically backed way to relate the canonical models of the three layers. There is lost information. d) In
multi-scale modeling, the system is partitioned into zones and each model applies to a different zone. However,
this creates mismatches on the boundaries that must be rectified using empirical data.


First, a conceptual model of the system is built. There are many ways this may be done,
including influence diagrams, causal loop diagrams, and systemigrams, to name a few. The
important thing is that potentially relevant objects and the relationships among them are
identified. At this stage, the objects may be vague and may come from traditionally different or
even incompatible ontologies. In order to build a useful model, one must make use of symmetries.

In the physical sciences in particular, these symmetries have been grouped into theories that
provide useful, tested ways to represent certain phenomena. Thus, the goal of the modeler is
to reorganize the mixture of objects into well-defined groupings, where each grouping is
associated with a conventional abstraction and theory. If this can be done, there are often
well-defined and well-developed modeling approaches to represent each grouping in isolation.
The problem is that now we may have multiple groupings. If there are no transition linkages
among the groupings, then one may be able to proceed with model integration at this point.
However, as we saw in the multi-scale literature, there are some problems where there are
overlapping representations. These create transition linkages among the models that must be
removed. The typical approach in multi-scale modeling is to partition the system such that a
different model applies to each region. However, buffers are often introduced and empirical
"handshake algorithms" must be developed. Thus, while each of the individual models may be
well validated, this validation does not automatically pass to the composite model.

Next, we consider a refactoring approach more typical of the social sciences and some
engineering applications. A notional illustration is provided in Figure 12. As with the previous
case, we start with a conceptual model of the system. In this case, however, the objects may
not map cleanly into well-established abstractions. Consequently, the modeler refactors the
objects and relationships until a single abstraction is created. This is likely done through a
combination of several mechanisms. The most obvious is dropping objects and relationships
that may be deemed either unimportant or too difficult to handle. It may also involve
replacing or merging objects and relationships with approximations that are more compatible
with other objects in the model. Finally, in the most extreme case, the modeler may create new
objects and relationships from empirical data (see LVT analysis in Section 5.2). Once this
"refactoring" is complete, the transition linkages have been managed, and the modeler has
created a single internally consistent abstraction. The problem is that this "new" abstraction is
essentially untested. Any standard abstractions or theories that may have applied to the
original conceptual model may have been altered. Thus, as with the previous case, we have
created a validation issue, and it is unclear to what extent the predictions of the model can be
trusted.




Figure 12 - Approach 2: a) Initial conceptualization of the system will likely involve mixtures of many factors and
relationships from different abstractions. b) Since the abstractions are not necessarily compatible, the modeler
modifies the factors and relationships to create a conceptually consistent model. This may be accomplished
through a combination of dropping factors, creating approximations, selecting alternative representations, etc.
c) Since there is now an internally consistent conceptual model, it can be represented using a single, consistent
mathematical or computational model. However, the process of modifying the conceptual model likely lost
information contained in the original abstractions. Consequently, the "refactored" model should be compared to
empirical data and adjusted to compensate for lost information.

In both cases, we correct the compatibility problems by fitting to data, but in the process we
lose some of the predictive power of the original theories we leveraged. While at first glance
this may not seem to be an issue because the model is tested against the data, what we have
effectively done is create a local fit. This means that we may have lost the ability to generate
the full range of potential scenarios that may result from a policy. This issue is described
notionally in Figure 13.




Figure 13 - a) When a model is developed and validated against historical data then used for prediction, it is
equivalent to extrapolating a trend. b) Parametric sensitivity analysis or frequentist prediction intervals put
upper and lower bounds on the trend but would not be able to detect a shift in the trend triggered by structural
changes. c) Systematically introducing alternative structure to the model can generate alternative trends. d) The
ideal output of such an analysis would be a multi-modal probability distribution of potential outcomes.

Once a composite model is built and evaluated against empirical data, we have the ability
to generate predictions (a). However, we know that there is uncertainty in the model, so we
perform sensitivity analysis or apply a more rigorous uncertainty quantification approach. The
problem is that this is typically done over the parameters of the model, as there is a natural
space over which to vary them. This results in a distribution of possible outcomes represented
by a prediction interval in Figure 13 (b). However, due to the model development processes
described in Figure 11 and Figure 12, some of the predictive power of the original theories is
lost. We can think of this as some of the structure having been thrown out through either the
partitioning or the refactoring process. Furthermore, there may have been alternative
abstractions that could have applied to the original conceptual model but, for whatever reason,
were not selected. This, too, is lost structure. If there were a way to reintroduce this discarded
structure, it may generate predictions that are very different from the ones produced by the
built model (c).


Depending  on  the  circumstances,  these  alternative  trajectories  may  be  assigned  an  extremely 
low  probability  under  conventional  sensitivity  analysis.  That  is  why  when  these  trajectories  do 
occur  in  real  life  they  are  "unexpected,"  "unintended,"  or  "counterintuitive."  What  we  would 
rather  have  is  a  more  justifiable  approach  to  developing  model  predictions  by  systematically 
introducing  this  lost  structure.  This  would  result  in  a  more  complete  probability  distribution  of 
potential policy outcomes. Notionally, this could be viewed as recovering modes in the
distribution that were lost as a consequence of the model building process (d).
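
The difference between parametric and structural variation can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the linear `core_model`, the saturating `with_saturation` variant, the parameter range, and the crude `count_modes` helper are assumptions and are not taken from the models discussed in this report.

```python
import random

def core_model(policy_strength, gain):
    """Hypothetical built model: the response scales linearly with the policy."""
    return gain * policy_strength

def with_saturation(policy_strength, gain, cap=4.0):
    """Hypothetical discarded structure: the response saturates at a cap."""
    return min(core_model(policy_strength, gain), cap)

def sample_outcomes(model, n, rng, policy_strength=10.0):
    """Vary only the parameter `gain`; the model structure stays fixed."""
    return [model(policy_strength, rng.uniform(0.8, 1.2)) for _ in range(n)]

def count_modes(outcomes, gap=2.0):
    """Crude mode count: clusters of outcomes separated by more than `gap`."""
    ordered = sorted(outcomes)
    return 1 + sum(1 for a, b in zip(ordered, ordered[1:]) if b - a > gap)

rng = random.Random(0)
# Parametric sensitivity analysis: one structure, many parameter draws.
parametric = sample_outcomes(core_model, 1000, rng)
# Structural variation: the same draws plus an alternative structure.
structural = parametric + sample_outcomes(with_saturation, 1000, rng)

print(count_modes(parametric))
print(count_modes(structural))
```

Under these assumptions, varying the parameter alone only widens a single cluster of outcomes, while reintroducing the alternative structure adds a second, separate cluster, which corresponds notionally to panels (b) and (d) of Figure 13.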


7.1.2  Factors  for  Consideration 

If  we  would  like  to  systematically  explore  variations  in  model  structure  to  generate  a  spread  of 
scenarios  like  those  depicted  in  Figure  13,  there  are  several  factors  that  we  must  consider.  First, 
when  developing  a  composite  model,  we  are  not  totally  unconstrained  in  our  selection  of 
abstractions. For any policy (or design) problem there is typically a limited set of factors under
our (or the policy maker's) control as well as a particular set of consequences that we hope to
achieve  (or  avoid).  This  naturally  leads  us  to  a  limited  set  of  control  variables  and  response 
variables.  These  significantly  constrain  the  model  development  process  as  any  model  built  must 
provide  a  complete  linkage  from  control  variables  to  response  variables.  When  the  control 
variables  and  response  variables  lie  in  what  are  traditionally  considered  separate  abstractions, 
this  can  be  very  challenging.  Unfortunately,  this  is  the  usual  case  for  an  enterprise  problem. 

While  not  an  explicit  motivation  for  the  core-peripheral  approach  developed  during  RT-138,  in 
retrospect,  this  was  probably  the  reason  that  the  approach  emerged.  The  linkage  between  the 
control variables and response variables necessarily forms the core. If this linkage is not valid,
then  the  entire  modeling  effort  is  useless.  Once  the  core  is  established,  variations  in  structure 
or  higher  order  structure  can  be  introduced  to  trigger  "higher  order"  effects  on  the  predictions 
of  the  core.  These  are  the  peripheral  models.  In  essence,  we  are  looking  for  factors  that 
"disrupt"  the  control  linkage. 

However, reflection on both the literature review and the mathematical analysis on multi-scale
ontologies suggests that there are probably several different cases that a modeler may
encounter. Each case may need to be addressed in a different way. Here we lay out each of the
cases we have identified, though we note additional cases may be identified through future
work.

Case 1: Direct integration of peripheral models with core model

In this case, only state linkages exist between the core and peripheral models. As a result,
peripheral models may be swapped in and out as needed. While this may happen due to
"luck," the more likely situation is that the modeler refactored the conceptual model to remove
transition linkages between the core and peripheral models. In retrospect, this was the
approach taken during the development of the counterfeit part intrusion model during RT-110
and RT-138.

Report No. SERC-2017-TR-106, April 30, 2017

Case  2:  Separate  peripheral  models  with  handover  to  core 

In  this  case,  there  are  one-way  transition  linkages  between  the  peripheral  models  and  the  core. 
This  case  is  analogous  to  multi-fidelity  modeling  approaches.  The  peripheral  models  may  be  run 
first  and  then  the  resulting  outputs  can  be  handed  over  to  the  core  model  as  alternative 
parameter  values.  The  core  model  is  then  run  second. 
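
A minimal Python sketch of this handover is given below. The function names, the `compliance_rate` parameter, and the numeric values are hypothetical placeholders introduced only to show the order of execution; they do not come from any model in this report.

```python
def peripheral_low_compliance():
    """Hypothetical peripheral model: its output is a parameter for the core."""
    return {"compliance_rate": 0.3}

def peripheral_high_compliance():
    """Hypothetical alternative peripheral model."""
    return {"compliance_rate": 0.8}

def core_model(inspections, compliance_rate):
    """Hypothetical core: links a control variable to a response variable."""
    return inspections * compliance_rate

def run_with_handover(peripheral, inspections):
    """One-way linkage: run the peripheral model first, then the core."""
    params = peripheral()                       # step 1: peripheral model
    return core_model(inspections, **params)    # step 2: core model

scenarios = {
    p.__name__: run_with_handover(p, 100.0)
    for p in (peripheral_low_compliance, peripheral_high_compliance)
}
print(scenarios)
```

Because information flows only from the peripheral model to the core in this sketch, each alternative peripheral model can be run once and its output reused across core runs.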

Case  3:  Partitioning  the  state  space 

In  this  case,  the  two-way  transition  linkages  cannot  be  eliminated  through  refactoring. 
Consequently,  the  only  solution  is  to  partition  the  state  space  and  apply  different  abstractions 
to  different  portions  of  the  state  space.  Here  peripheral  models  are  alternative  representations 
of  these  portions  of  the  state  space.  This  is  analogous  to  multi-scale  modeling  where  the 
alternative  models  for  each  scale  are  switched  in  and  out.  The  problem  here  is  that  potentially 
new  empirical  "handshake"  algorithms  may  need  to  be  developed  for  each  combination  of 
models. 
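
The partitioning idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the two models, the boundary, the buffer width, and the linear blending rule are all hypothetical, and the blend stands in for what would in practice be an empirically developed "handshake" algorithm.

```python
def fine_model(x):
    """Hypothetical detailed abstraction, assumed valid for small x."""
    return x ** 2

def coarse_model(x):
    """Hypothetical aggregate abstraction, assumed valid for large x."""
    return 2.0 * x - 1.0

def composite(x, boundary=1.0, buffer=0.25):
    """Apply a different model to each region of the state space.

    Inside the buffer around the boundary, the two predictions are blended
    linearly. The blend is a placeholder for an empirical handshake.
    """
    lo, hi = boundary - buffer, boundary + buffer
    if x <= lo:
        return fine_model(x)
    if x >= hi:
        return coarse_model(x)
    weight = (x - lo) / (hi - lo)   # 0 at the fine side, 1 at the coarse side
    return (1.0 - weight) * fine_model(x) + weight * coarse_model(x)

for x in (0.5, 1.0, 2.0):
    print(x, composite(x))
```

Swapping in a different `fine_model` or `coarse_model` would, in general, require the buffer and blending rule to be revisited, which is the cost this case highlights.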

While all three cases are important, for the approach developed in this report, we will focus on
case 1 as it is the most tractable. However, it is expected that the developed approach can be
extended to accommodate cases 2 and 3. It is likely that category theory will play a role in
accomplishing this extension.

7.1.3  Steps  of  the  Approach 

The  systematic  approach  to  developing  an  interoperable  core-peripheral  model  is  illustrated 
notionally  in  Figure  14.  Essentially,  this  approach  combines  elements  of  both  of  the  existing 
approaches presented above. Again, we start with the conceptual model (a). This time,
however,  we  identify  the  control  and  response  variables,  and  identify  the  relevant  paths 
between  them.  These  paths  are  candidates  for  establishing  the  core  model.  At  a  minimum,  the 
core  model  must  include  at  least  one  path  from  control  to  response,  but  multiple  may  be 
included.  Presumably,  the  core  will  include  what  are  perceived  to  be  the  most  "important" 
factors.  In  essence,  this  would  be  the  "first  order"  representation  of  the  system  (b).  This  core  is 
then  represented  using  some  combination  of  refactoring  or  partitioning  to  manage  transition 
linkages  and  create  an  internally  consistent  model  (c).  The  portions  of  the  conceptual  model 
that  were  omitted  from  the  core  are  candidates  to  become  peripheral  models.  These  are 
refactored to eliminate transition linkages between them and the core. Note that there may be
more than one valid formulation of each peripheral model, particularly when the peripheral
models represent behavioral and social factors. Finally, mathematical and/or computational
models are developed for the core and peripheral models (d).

Figure 14 - Systematic Approach 2: a) Initial conceptualization of the system will likely involve mixtures of many
factors and relationships from different abstractions. b) The modeler identifies the control and response
variables and identifies the most important chain of relationships between them. This constitutes the core. c)
Since the abstractions are not necessarily compatible, the modeler modifies the factors and relationships to
create a conceptually consistent core model. Factors that are off the core paths are candidates
for peripheral models. These are refactored to eliminate any transition linkages with the core. There may be
more than one version of each peripheral model. d) Since there is now a set of mutually consistent conceptual
models, they can be represented using consistent mathematical or computational models.
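
The identification of candidate core paths in steps (a) and (b) can be sketched in Python by treating the conceptual model as a directed graph. The factor names below are hypothetical, loosely inspired by the counterfeit parts example, and choosing the shortest path as the core is an assumption made for illustration; in practice the modeler would select the paths judged most important.

```python
def simple_paths(graph, start, goal, visited=()):
    """Enumerate all cycle-free paths from start to goal (depth-first)."""
    visited = visited + (start,)
    if start == goal:
        return [list(visited)]
    paths = []
    for successor in graph.get(start, []):
        if successor not in visited:
            paths.extend(simple_paths(graph, successor, goal, visited))
    return paths

# Hypothetical conceptual model: factor -> factors it influences.
conceptual_model = {
    "inspection_policy": ["detection_rate", "supplier_cost"],
    "detection_rate": ["counterfeit_intrusions"],
    "supplier_cost": ["supplier_exit"],
    "supplier_exit": ["counterfeit_intrusions"],
    "public_opinion": ["supplier_exit"],
}

control, response = "inspection_policy", "counterfeit_intrusions"
candidates = simple_paths(conceptual_model, control, response)

# Assumption: take the shortest control-to-response path as the core.
core = min(candidates, key=len)

all_factors = set(conceptual_model) | {
    f for targets in conceptual_model.values() for f in targets
}
peripheral_candidates = all_factors - set(core)

print(core)
print(sorted(peripheral_candidates))
```

Factors that do not lie on the chosen core path become candidates for peripheral models, which would then be refactored as described above.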

Since  the  transition  linkages  have  been  eliminated,  the  core  model  can  be  mixed  and  matched 
with  the  various  peripheral  models  to  generate  alternative  trajectories.  Each  combination 
would  generate  a  different  scenario.  If  probabilities  are  assessed  for  each  combination  of 
models,  it  becomes  feasible  to  generate  a  probability  distribution  for  the  set  of  possible 
outcomes. 
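
As a minimal sketch of this mix-and-match step, the Python below enumerates every combination of peripheral model alternatives and attaches a probability and an outcome to each. The slot names, probabilities, and multipliers are hypothetical. Two further assumptions are made for simplicity: the peripheral slots are treated as probabilistically independent, and each peripheral model acts on the core prediction as a simple multiplier.

```python
from itertools import product

BASELINE = 100.0  # hypothetical prediction of the validated core model

# Each peripheral slot lists alternatives as (label, probability, multiplier).
PERIPHERAL_SLOTS = {
    "supplier_behavior": [("stable", 0.7, 1.0), ("exit", 0.3, 0.5)],
    "demand": [("flat", 0.6, 1.0), ("surge", 0.4, 1.5)],
}

def enumerate_scenarios(baseline, slots):
    """Return (labels, probability, outcome) for every model combination."""
    scenarios = []
    for combo in product(*slots.values()):
        labels = tuple(label for label, _, _ in combo)
        probability = 1.0
        outcome = baseline
        for _, p, multiplier in combo:
            probability *= p       # assumes the slots are independent
            outcome *= multiplier  # assumes a multiplicative state linkage
        scenarios.append((labels, probability, outcome))
    return scenarios

scenarios = enumerate_scenarios(BASELINE, PERIPHERAL_SLOTS)
for labels, probability, outcome in scenarios:
    print(labels, round(probability, 2), outcome)
```

The resulting list is a discrete probability distribution over outcomes. If the independence assumption does not hold, the probabilities would instead need to be assessed jointly for each combination.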

A  few  things  to  note: 

•  There is no guarantee that it will be possible to refactor the conceptual model such that
there are no transition linkages among the core and peripheral models. As we noted
above, we are focusing on case 1 first. The presence of transition linkages would trigger
either case 2 or 3. In principle, these can be accommodated, but will require extra steps.

•  It is critically important to validate the core model. This serves as the baseline for the
subsequent analysis. If it is not credible, then none of the results will be credible.

•  The  integration  of  one  or  more  peripheral  models  into  the  core  generates  alternative 
predictions,  but  there  is  no  guarantee  that  they  are  real.  However,  they  do  establish 
possibilities  that  could  be  further  investigated  or  hedged. 

•  While one could probably perform a full factorial analysis for a relatively small number of
peripheral  models,  as  a  practical  matter,  there  may  be  many  possible  combinations  of 
peripheral  models.  This  will  be  particularly  true  for  peripheral  models  that  represent 
complex  phenomena  such  as  behavioral  and  social  factors.  Consequently,  there  is  a 
need  for  an  approach  to  both  identify  potential  model  formulations  as  well  as  navigate 
through  them  in  a  reasonable  way.  This  will  be  addressed  in  the  next  few  sections. 
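
To make the scale of the problem concrete, suppose (as an assumption for illustration) that there are k peripheral slots, that slot i has n_i alternative formulations, and that each slot may also be left out of the composite model. The number of distinct core-peripheral combinations N is then:

```latex
N = \prod_{i=1}^{k} (n_i + 1)
```

For example, k = 10 slots with n_i = 3 alternatives each gives N = 4^10 = 1,048,576 combinations, which is why exhaustive enumeration quickly becomes impractical.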


7.1.4  Validation 

One obvious question about the above approach is that of validation. We propose that
validation efforts focus on the core model. The reason, as stated above, is that if the first
order model is not credible, then the rest of the modeling effort is irrelevant. Thus, this model
should undergo rigorous evaluation by subject matter experts and tests against data as
appropriate.

Note that it is critical that the core model be validated in isolation from the peripheral models
for  two  reasons.  First,  the  core  model  represents,  in  a  sense,  business  as  usual  while  the 
peripheral  models  represent  departures  from  business  as  usual.  One  cannot  assess  the  impact 
of  the  departure  if  the  model  representing  business  as  usual  is  miscalibrated.  Second, 
peripheral  models  introduce  additional  degrees  of  freedom.  Attempting  to  validate  the 
combined  model  is  essentially  self-defeating  as  it  increases  data  requirements  for  validation 
and  confounds  the  very  relationships  we  are  attempting  to  tease  out.  In  essence  it  devolves  to 
the  interpolation  case  described  previously,  which  is  exactly  what  we  are  trying  to  avoid. 

As  far  as  the  peripheral  models  themselves,  they  should  be  structurally  valid,  meaning  that  they 
conform  to  known  theory  or  data,  but  it  is  less  important  to  validate  them  in  isolation.  Their 
role  is  to  support  the  equivalent  of  "what  if"  analyses.  So  the  question  is  not  necessarily 
whether  we  know  them  to  be  representative  of  current  or  projected  circumstances,  but  rather 
we  want  to  know  what  will  happen  if  we  assume  they  are  representative. 

Once experiments are run by varying combinations of peripheral models with a validated core,
there is a question of the validity of the predictions themselves. While this is somewhat
dependent on the circumstances and the system being modeled, we may never know. There may
be some circumstances where we can run an experiment or collect some additional data to test
a prediction. However, when we are concerned with policy, we may be dealing with
counterfactuals. For instance, when a government chooses a particular macro-economic policy,
we will never know what would have happened if they had chosen a different policy. In these
cases, the various scenarios generated by this approach may be untestable. Subject matter
experts may be able to eliminate some based on infeasibility, but the remainder should at least
be considered as potential scenarios in the decision making process. These can feed strategy
development much as scenario analysis is used in strategic planning today.


7.2  Potential Directions for Organizing Models for Use in Enterprise Analysis

As discussed in Section 5.2, there are issues and challenges associated with formulating models
from social theory. Given both the variety of different models and their associated ontologies,
there is a question of how these should be captured and organized. Of interest is a means of
capturing theory about social phenomena that both preserves information about domain validity
and "stores" the theory such that a conceptual model can be presented. Then
the conceptual model can be "compressed" and assessed by applying the modeling and simulation
principles described in Section 6. The motivation here is to catalogue social theory validity in
the appropriate model theory. Then observations in the enterprise system of interest would
have a "place" in an enterprise model theory. The goal is an approach and schema for doing
this systematically. In this section, we consider the required
identifications, relations, and 'higher-order' types. Next, we describe the problem up to ordinal
theory and develop a systematic means for constructing categories as a working hypothesis.
This provides useful encapsulations for social theoretic components and highlights categorical
relationships as a potential research domain. It is hoped that, when fully developed, this
approach will present a useful framing to maintain consistency within social theory and provide
a formal mechanism to guide practitioners and modelers when integrating social theories into
their enterprise models. It should be emphasized that the approach presented below is just
a tentative hypothesis intended to provide a starting point for investigation. Much additional
research is required, and it is expected that what is proposed here will evolve substantially.

7.2.1  Social  Constructs  &  Nomological  Network 

Continuing from the social measure theory review in Section 5.2, one ends up with measures
that provide functional mappings, but only after sufficient definitional search. Ideas of how to
provide appropriate 'objects' and their valid relationships (construct validity) were touched
upon. There were two 'approaches' to the analysis (EFA and CFA) with the intent to get an
available extendable representation (PCA). The crux of the analysis was considering the
'higher-orders' under the appropriate structural context, which involved assessing more than
just the internally available observations. This gave a causal description, but these theoretic and
model statements were not 'nice' in the deductive falsifiable sense. These descriptions, while
measurable, were not (necessarily) identifiable and, if consciously identifiable, were not
(necessarily) measurable. We reason here that any such theory therefore constructs itself at some
ordinal above the modellable compression.

We then showed that categorical structuring is applicable in some situations and, similarly,
that people exhibit categorical 'choicing' in the model theoretic sense. In sterile measurement
conditions, or by bounding the 'system of interest', these bounds provide identifiable decisions,
measurables, and other model descriptions. But 'in the real world', we attend to a multitude of
unbounded decisions in which individual and group choice induce orderings. Behavioral
economics and social information theories argue that certain behavior localizes to a bias
estimate (Tversky & Kahneman 1981), that an ordering emerges (Hayek 1945), and that a choice
emerges (Arrow 1951) without any individual being directive. These then might not be
'measurable' even if subjectively reported or, conversely, if they are reportable, they might not
satisfy all possible descriptions. This is to say that a particular universal descriptor may not be
complete as a compression, a priori, and this by definition would have consequences 'unintended'
by either the individual or a group policy response.

With this in mind, there is a unique perspective toward modeling when a component theory
involves a domain of psychological or social science. There is not only the potential for as yet
'unseen' effects with traditional observation, but there are potential reorderings in the algebra,
and the representation of that algebra might diverge from the individual's 'mental
model' structure. Even then, there is a post-modernist perspective that under a language
individuals 'bifurcate', and this adds complexity under this 'intermittent' algebra (Susen 2015).
This would be even less accessible than the previous case, as it extends the space under
measurement (i.e., the individual has a measurable extension outside the group setting in which
the individual is involved). These distinctions make for dual considerations plus extensions
within language, all within any 'psycho-social' component. By this argument, we propose that this
points to a need for a basis in typology and for potential classes over model validity.

Since both the 'individual' and 'group' objects involved in a model can have differing
associations under any instance, one involves an ordinal function in any multi-scale or
multi-ontology modeling, and this directly influences how the objects operate and hence the
'appropriate' simulations. This opens the discussion to logics embedded within the
'post-modernist' tradition, but as a measure theory, these traditions have not been sufficiently
defined. While we will touch on the notions, the engineering need is to translate the
encountered discussion into an associated model theoretic framework, if only to allow us to
internalize the framework limits. Ideally, any 'nice' engineering approach would try to objectify
and reduce where possible or efficient, but these lack topological constraint. Clearly, there is
a need to identify where the models are limited and to develop associated guidance for their
application. One would also want a pseudo-metric on the associated contribution to model
risk; for example, needed additional inductive assessments or potential consequences of an
additional constraint. While these social theories are outside of the conventional systems
engineering domain, the point is to make a variety of programs, logics, and
representational objects given by social scientific theories available to enterprise modelers.
However, this requires systematic attention. These 'human and social phenomena' are still
describable using a language accessible to individuals and groups, however 'soft' or
self-referencing. At worst, these are then higher-order language types that are used to describe
over 'lower-order' objects, yet would still be a starting point for a hypothesis.

Because of this, typology around construct validity serves as a starting position for model
theoretic descriptions. Over metric spaces, this has interesting logics and ontologies that are
useful for systems methodology. For those objects that are externally definable, class
associations are described as employing a "pattern-matching logic" (Shadish et al. 2001). As
described by Shadish:


"The most common theory, each construct has multiple features, some
of which are more central than others and so are called [more]
'prototypical'. To take a simple example, the prototypical features of a
tree are that it is a tall, woody plant with a distinct main stem or trunk
that lives for at least 3 years (a perennial). However, each of these
attributes is associated with some degree of [uncertainty]. For
instance, height and 'distinct' trunk distinguish trees from shrubs. But
what some trees to shrubs [have these as ideals]. [So as described by
human systems], no attributes that are foundational are foundational.
Rather used a pattern-matching logic to decide whether a given
instance sufficiently matches the prototypical features to warrant
using the category label, [more importantly] given other available
category labels."


One  can  see  where  constructs  can  be  argued  from  a  stance  over  fuzzy  logic.  If  one  is  trying  to 
'assign'  a  set  item  from  a  'fuzzy'  class  of  "tree",  then  one  can  use  fuzzy  logic  for  inference 
potential  and  assign  set  descriptions  to  profiles  describing  the  'fuzzy'  "tree"  form;  e.g.  normal 
curve  around  say  x  height  to  be  tree.  One  would  then  need  reliable  observation  or  theory  for 
the  'tree'  prototype  and  even  then  relational  statements  between  it  and  'shrub'.  For  an 
enterprise, the need would be to assess the self-defined objects and the potential fuzzy classes
to which these relate. But these 'theories' are themselves subject, as Shadish notes, to the fact that
the class of "tree" is itself a fuzzy notion in the domain of the mind when we attempt to
simulate it in the mental models. Now one could continue ad infinitum indexing fuzzy object
relationships  by  making  increasing  order  on  probabilistic  assignments  to  class,  type,  and  object 
to  converge  a  space  (i.e.  a  Bayesian  net  framework).  But  then  one  asks  what  'bounds',  'limits' 
or  similar  topos  property  show  that  the  general  openness  is  reasonably  closable  in  an  instanced 
context. This is to show that an enterprise is left with the satisfiability problem at the universal
even when an instanced situation has a Bayesian solution. In mathematical model terms, the
openness  in  its  algebraic  and  topological  uncertainty  does  not  necessarily  give  generalizing 
categorical  properties  or  convergent  algorithmic  properties. 
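
As a minimal sketch of the fuzzy 'tree' assignment described above, the following Python fragment scores a single observed height against two class prototypes and picks the better-matching label. The Gaussian membership form, the prototype heights and spreads, and the function names are illustrative assumptions and are not taken from Shadish et al; a real application would need reliable observation or theory behind each prototype.

```python
import math

def gaussian_membership(x, center, width):
    """Degree (0..1) to which x matches a prototype centered at `center`."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

# Hypothetical prototypes: (typical height in metres, spread). Illustrative only.
PROTOTYPES = {"tree": (15.0, 6.0), "shrub": (2.0, 1.5)}

def classify(height_m):
    """Return the best-matching label and the membership degree of every label."""
    degrees = {label: gaussian_membership(height_m, center, width)
               for label, (center, width) in PROTOTYPES.items()}
    best = max(degrees, key=degrees.get)
    return best, degrees
```

Under these assumed prototypes, classify(12.0) favors 'tree' and classify(1.5) favors 'shrub', while intermediate heights receive only partial membership in both classes. That intermediate region is where the relational statements between 'tree' and 'shrub' carry the weight.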

This  then  becomes  an  epistemological  problem  of  satisfiability  over  various  mental  models 
against  the  models  of  any  'technical'  nature.  Latent  considerations  dually  approached  itemized, 
algorithmic  (i.e.  set  theoretic)  responses  by  considering  both  analytic  and  algebraic  objects  as 
was  seen.  There  were  linear  algebraic  descriptions  on  the  response  profile  in  specific 
intelligences, and one then collects more generalized description in models from the inductive
support of fit statistics. Understandably then using statistics to measure first order responses is
useful  as  they  have  definition  over  the  natural  measure  space  (i.e.  itemized  measure 
themselves),  so  it  is  not  as  if  humans  'invalidate'  the  natural  properties  over  this  'model  order'. 
For  instance,  one  measures  a  pilot  over  a  flight  model  to  measure  pilot  behavior,  and  involving 
a  human  mind  does  not  invalidate  the  model  setup  as  measuring  a  pilot  response  has  strong 
validity  when  comparing  this  behavior  under  the  'flight  model'  constructed  model.  This  may 
seem  unnecessarily  tautological,  but  this  is  what  a  'natural  limit'  or  'boundary'  gives  us.  If  the 
pilot  does  not  execute  under  certain  constraints,  it  is  likely  that  the  plane  will  crash,  and  this 
gives us a universal to solve our 'ordering' problem. Additionally, this schema is given a
priori, so one can fall back on reason even when supposing a large space in which behavior can
operate.

Now  consider  extending  the  model  and  reusing  theoretic  descriptions  and  moving  to  a 
generalized  theory.  Now  we  must  deduce  the  constructed  boundaries,  validate  the 
abstractions  used,  and  analyze  whether  these  constructs  relate  to  each  other  under  our 
considered  transformation.  Within  a  'local'  model,  the  item  descriptions  are  deduced 
themselves  and  used  to  induce  up  to  a  general  description.  When  extending  a  model,  it 
'reuses'  the  base  and  analysis,  and  for  parsimony,  a  modeler  has  to  validate  the  description  and 
boundary involved. This is not absent in other areas, but as noted, the 'psycho-social' space has
means for 'exploding' the constructed bases as humans are order generators (increasing
deductive potentials) and language generators (increasing inductive potentials). The bigger
problem  is  then  the  setting  of  these  generalized  abstractions  in  'psycho-social'  space  and  their 
validation.  Considering  the  general  domain  then  has  'tuples'  with  the  deductive-inductive 
categorical  bases  (i.e.  a  'valid'  construction).  As  humans  as  a  set  object  can  insert  themselves  in 
varieties  of  constructs  and  spawn  them,  this  becomes  an  increasing  algebraic  problem. 

However,  it  is  natural  then  to  'halt'  our  considerations  under  different  contexts.  When  a  pilot 
goes into health care, generally people consider this as a different category onto which we can
'project' what is needed by the health care 'dimension'. This 'drops' some considerations, but
as  pointed  out,  this  compression  just  involves  tracing  the  inference  potential  dropped.  A  pilot- 
healthcare  metamodel  is  assumed  to  be  universally  intractable  across  society,  but  we  can 
rationalize  that  we  care  about  particular  information  thus  making  it  tractable.  It  is  then  an 
economic  decision  to  determine  what  is  involved  in  this  tradeoff.  When  this  invalidates  an 
internally  assumed  construct  represented  in  our  language,  the  models  become  complex  under 
composition.  This  is  because  some  information  was  lost  to  the  formal  system  about  the 
boundary  limits.  When  this  occurs,  social  science  terms  it  'dissonant'  compared  to  a  mental 
model  domain. 

The  'halting'  is  done  usefully  by  inserting  a  'state  change'  (or  similar  'cut')  with  the  established 
'pilot'  and  'civilian'  behavior.  This  identifies  an  ordering  on  the  model  space  based  on  the 
mental  model  choice.  However,  then  one  begins  to  divide  the  model  space  for  an  individual 
and  then  the  class  of  'pilot-civilian'  unitary  set.  This  can  still  be  split  by  other  transformations 
or  by  assessing  a  choice  procedure.  The  point  is  then  one  tracks  the  'cross-sectional'  between 
individual actions in the system with the available class descriptions (both conscious and
potential latent descriptions). Tracking then involves the inductive base, unit individuals, and
the  unitary  class  independently  as  components  for  validation.  These  bases  are  not  always 
parsimonious,  so  the  model  theory  space  grows.  What  latent  variable  theory  then  tried  to 
notice  is  when  this  expansion  did  not  happen  thus  implying  an  observable  point  (i.e.  anti- 
entropic  phenomena)  that  can  be  assigned  to  a  unique  'human  behavior'.  More  interesting  is 
when  people  change  internal  'states'  either  consciously  or  not  which  presents  an  encrypted 
latency.  So  then  one  has  to  do  'pattern  matching'  across  the  cross-sectional  which  diagonalizes 
the  model  space  and  characterizes  intractability.  Thus,  it  is  not  surprising  that  model 
composition  with  social  and  cognitive  aspects  is  not  well  identified. 

So  considered  from  an  engineering  perspective,  'social  theory'  does  not  always  provide  stable 
unit  descriptions.  For  example,  a  'unit'  in  social  science  may  not  be  as  stable  as  a  unit  in  the 
physical  sciences  like  the  Watt.  Rather  unit  classes  are  available  over  a  given  domain  from  the 
available  language  across  individuals.  So  having  reliable  factors  for  a  'pilot  class'  is  not  as 
available  per  se  as  knowing  that  a  construct  involves  the  'pilot  class'  itself.  This  then  gives  the 
determinable dimensional change and inductive bases involved. Also note that all of this, without a
reasonable basis for these 'states within pilot<->civilian', does not have symmetry to the mental
models (i.e. the 'rational' human mind is not universally parsimonious to its own behavior). Nor
is there a symmetric measure that totally configures under this "state change"; e.g. providing a
brain  scan  as  a  means  for  establishing  symmetry  still  has  an  identification  problem  (and  thus 
does  not  necessarily  satisfy).  This  then  implies  that  particular  measures  are  embedded 
categorical  notions  that  involve  trades  on  scientific  and  model  language. 

This  then  defines  our  discussion  to  consider  the  'construction  problem'  up  to  sufficient  type 
validity  and  the  language  involved.  The  instanced  algebraic  description  and  the  topological 
mapping(s)  invoke  a  protean  categorical  analysis  as  a  model  accumulates  types  and  spatial 
constructions  respectively.  It  is  our  impression  that  enterprise  modelers  have  encountered  this 
for  socio-technical  systems,  and  this  underlies  the  composition  and  consequence  problems. 

The  incorporation  of  SME  impressions  and  'multi-leveled'  modeling  has  defendable  intuition  in 
this  regard  as  it  gives  these  'higher  orders'  rational  constructions.  These  methods  (or  instances 
on  these  methods)  then  must  manage  compression  in  reducing  the  model  space  sufficiently  to 
not  lose  the  aspects  under  consideration.  The  attention  within  social  science  modeling  is 
symmetric  in  discussion  as  it  manages  the  same  attention.  Then  the  suggestion  involves  using 
categories  to  organize  socio-technical  system  modeling  as  it  incorporates  descriptions  that  are 
'social  theoretic'  in  nature. 

This  becomes  more  apparent  as  system  modelers  consider  model  composition  across 
reciprocating  functions  from  observation.  If  assessing  an  abstract  idea  of  'tree'  is  sufficiently 
difficult to satisfy, reciprocating up to the idea of 'plant' is even more so, and ordinally more difficult
is the 'idea' of 'environmentalism'. This and other similar '-ilities' are within the purview of systems
engineering and are validated by theory only accessible by mental models. The range of
enterprise modeling then extends up to an ordinal concept just for a particular social
dimension. This helps define why objects are readily definable in social theory, but their
modalities do not close under collection. The representation remains an open question within
social science as to how to handle these systematic descriptions of behaviors. It is thus
understandable that a given system representation has difficulty. Section 5 described the large body
of work behind this problem, but the concern here is how to centralize the descriptions. Ideally
then there should be sufficiency to mirror system modeling in its representation,
simulation, and most importantly informational limits.

Pragmatically  system  modelers  can  take  a  page  from  social  modeling  practitioners.  One  has  to 
fall  back  on  some  realism  if  only  to  have  objects  with  which  to  model  and  usefully  simulate  to 
determine  the  extent  of  their  validity.  Then  one  actively  searches  classes  for  these  objects, 
assesses  their  prototypical  properties,  and  then  successively  types  the  developed  bodies  of 
theory.  Concurrently  these  are  validated  against  the  observed  itemized  descriptions  and 
mental  model  context  through  experimental  design,  higher  order  pattern  matching,  or  causality 
as  an  ordering  (Shadish  et  al  2001).  There  is  a  strong  and  growing  body  of  knowledge  that 
mirrors  soft  system  theoretic  development  as  these  are  creating  conceptual,  generalized 
theoretic  compression. 

An  encountered  idea  that  helps  support  our  'categorical  mapping'  claim  toward  enterprise 
modeling  was  embedded  within  the  construct  validity  literature.  Cronbach  and  Meehl  (1950) 
have  a  classic  paper  where  they  consider  the  prototype  theory  as  a  'nomological  network'.  This 
mirrors  the  graphical  representation  within  latent  variable  theory  where  the  open 
transformations  are  represented  by  network  connections,  but  one  would  recognize  their 
argument mathematically as an algebraic network with categorical relationships. They argue
that these successive identifications in ordinal space create a topological network for the
underlying 'concrete' descriptions. Campbell and Fiske (1959) relate this description to a
diagonalization  process  in  a  measurable  space  in  which  to  assess  these  open  relationships. 
There  is  then  a  formal  stance  that  this  identifies  potential  descriptions  with  increasing  power  in 
the  "network  of  nomologies".  The  description  here  is  that  pattern  symmetry  implies 
congruence  between  algebraic  categories.  Categories  are  used  in  other  areas,  and  the 
argument  here  is  that  the  patterning  over  large-ordering  devices  (i.e.  'human'  in  all  our 
capacity)  involves  constructs  that  have  unique  universal  in  validity  under  different  instances. 
Campbell  and  Fiske  present  a  Multi-trait,  Multi-method  matrix  representation  that  enterprise 
engineers  would  appreciate.  The  effect  is  to  gain  increased  power  on  cross-validation  by 
successively  rediagonalizing  the  space. 
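
The multi-trait, multi-method comparison can be sketched as follows. This is a hedged illustration rather than Campbell and Fiske's procedure in full: the scores are made-up values for two hypothetical traits (A, B), each measured by two hypothetical methods (m1, m2), and the code only separates same-trait correlations (convergent evidence) from different-trait correlations (discriminant evidence).

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for five subjects, keyed by (trait, method).
SCORES = {
    ("A", "m1"): [1, 2, 3, 4, 5],
    ("A", "m2"): [2, 3, 5, 7, 9],
    ("B", "m1"): [5, 1, 4, 2, 3],
    ("B", "m2"): [4, 2, 5, 1, 3],
}

def mtmm_blocks(scores):
    """Split pairwise correlations into convergent and discriminant blocks."""
    convergent, discriminant = {}, {}
    for k1, k2 in combinations(sorted(scores), 2):
        r = pearson(scores[k1], scores[k2])
        if k1[0] == k2[0]:
            convergent[(k1, k2)] = r      # same trait, different method
        else:
            discriminant[(k1, k2)] = r    # different traits
    return convergent, discriminant
```

In this toy data the same-trait correlations are clearly larger in magnitude than the different-trait ones, which is the pattern one would look for before treating A and B as distinct constructs. Real data would rarely separate so cleanly.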

This  then  supports  the  conjecture  that  there  are  inter-  and  extra-  model  properties  that 
determine  the  universal  and  multiversal  validity  respectively  in  sociotechnical  systems. 
'Constructs'  define  an  ever  cascading  indeterminacy  that  is  categorical  in  nature.  Naturally  one 
thinks  to  use  category  theory  as  a  'language'  that  could  serve  as  tracing  'compression'.  This 
does  not  solve  it  per  se  as  in  the  end  the  goal  is  to  provide  objective  output,  but  representing 
theory in categories would trace the 'higher-level' orders more clearly. For example, Rosen's
categories do not eliminate complexity, but they make the simplifying relationships easier to
identify. Cronbach and Meehl (1950) claim that this 'categorical compression' is needed and
describe its potential (this author's emphasis):


"With these statements of scientific methodology in mind, we return to
the  specific  problem  of  construct  validity  as  applied  to  psychological 
tests.  The  preceding  guide  rules  should  reassure  the  "toughminded," 
who  fear  that  allowing  construct  validation  opens  the  door  to 
nonconfirmable  test  claims.  The  answer  is  that  unless  the  network 
makes  contact  with  observations,  and  exhibits  explicit,  public  steps  of 
inference,  construct  validation  cannot  be  claimed.  An  admissible 
psychological construct must be behavior-relevant (59, p. 15). For
most tests intended to measure constructs, adequate criteria do not
exist.  This  being  the  case,  any  such  tests  have  been  left  unvalidated,  or 
a  finespun  network  of  rationalizations  has  been  offered  as  if  it  were 
validation.  Rationalization  is  not  construct  validation.  One  who  claims 
that  his  test  reflects  a  construct  cannot  maintain  his  claim  in  the  face 
of  recurrent  negative  results  because  these  results  show  that  his 
construct  is  too  loosely  defined  to  yield  verifiable  inferences. 


A  rigorous  (though  perhaps  probabilistic)  chain  of  inference  is 
required  to  establish  a  test  as  a  measure  of  a  construct.  To  validate  a 
claim  that  a  test  measures  a  construct,  a  nomological  net  surrounding 
the  concept  must  exist.  When  a  construct  is  fairly  new,  there  may  be 
few  specifiable  associations  by  which  to  pin  down  the  concept.  As 
research  proceeds,  the  construct  sends  out  roots  in  many  directions, 
which  attach  it  to  more  and  more  facts  or  other  constructs.  Thus  the 
[social quanta] has more accepted properties than the [physical
quanta]: numerical [properties imply] more than the second order
factor space."


However, the intuition that categories are congruent in form to a 'nomological network'
allows one to notate the 'definitiveness' of a construct network by assessing the "finespun
network of rationalizations". So these "chains of inference" are then congruent to a particular
'universal functor' with which to then define a category. The potential benefit is that if so-called
'nomology logics' are available, these would inherit categorical theorems such that one can
test these both ways: the numerical 'small category' from deduced observables and the mental
model 'large category' from induced bases. This then potentially guides the
identification of unintended consequences, as one would like to know how these 'large category'
constructs insert into a 'small category' system (various mental models potentially affect a
system of interest) and how 'small categories' might then be constructed that are parsimonious
to encountered 'large category' abstractions (identify the consequences from previously
validated models).

These  then  imply  via  use  of  categories  that  algebraic  properties  must  be  matched  ('commute' 
across  all  large  and  small  involved).  Otherwise  there  will  be  multiple  types  of  'model 
bifurcation  error':  invoking  a  system  under  unjustified  mental  construct,  invoking  a  mental 
construct that does not present in the system, extending a system without symmetry in the
extension of the mental construct, and likewise extending a construct without parsimonious
system  model  composition.  These  are  then  'dissonance'  issues  between  the  observable  world 
and  the  mental  models.  But  system  observation  could  be  compressed  to  a  'large  category',  and 
these  serve  as  a  class-type  language  in  which  to  simulate  and  compose  measurable  models  to 
validate  at  instanced  'levels'.  While  this  invokes  the  intractable  'algebraic  problem',  this  allows 
more  dimensionality  on  enterprise  modeling  that  can  be  assessed.  Information  theory's  ordinal 
numbering  shows  that  some  compression  is  irreducible,  but  also  helps  identify  where  programs 
are  self-delimiting.  Additionally,  this  seemed  more  'natural'  to  the  analysis  observed  in 
psychological and social sciences, so would be parsimonious in categorical language. For
example, the IQ theoretic statements defined and showed measures over the 'open set' space on
a topology. Then by its theory the measure had difficulty extending over any 'closed set' space:
item by item ordering and individual by individual ordering, both of which have a denumerable
setting. Where IQ was valid was over group orderings; it is conjectured that this is due to the
openness invoked by the groups, which opens the simulation space. However, this itself lets
modelers  know  a  priori  that  there  is  limited  'extendibility'  in  evaluating  sub-group  systems,  so 
there  is  still  'information'  in  some  fashion  even  if  in  its  architecture. 
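
The 'model bifurcation errors' listed above can be read, loosely, as failures of a mapping between a construct network and a system model to preserve relationships. The sketch below is a simplified graph analogy, not category theory proper: constructs and system variables are nodes, relationships are directed edges, and all names (the example constructs and variables, `mapping`, `bifurcation_errors`) are hypothetical.

```python
def bifurcation_errors(construct_edges, system_edges, mapping):
    """Compare a construct network with a system model under a node mapping.

    construct_edges, system_edges: iterables of (source, target) pairs.
    mapping: dict sending construct nodes to system nodes (may be partial).
    """
    construct_edges, system_edges = set(construct_edges), set(system_edges)
    # Construct relationships with an endpoint that has no system counterpart.
    unmapped = {(a, b) for a, b in construct_edges
                if a not in mapping or b not in mapping}
    # Image of the remaining construct relationships, in system terms.
    image = {(mapping[a], mapping[b]) for a, b in construct_edges - unmapped}
    return {
        "construct_without_system_counterpart": unmapped,
        "construct_relation_missing_in_system": image - system_edges,
        "system_relation_without_construct": system_edges - image,
    }

# Hypothetical example: a mental-model network and a measured system model.
CONSTRUCT_EDGES = {("training", "skill"), ("skill", "safety"), ("morale", "safety")}
SYSTEM_EDGES = {("hours", "score"), ("score", "incidents"), ("budget", "incidents")}
MAPPING = {"training": "hours", "skill": "score", "safety": "incidents"}

REPORT = bifurcation_errors(CONSTRUCT_EDGES, SYSTEM_EDGES, MAPPING)
```

Here the 'morale' relationship has no counterpart in the system, and the 'budget' relationship in the system is not justified by any construct. When all three sets are empty, the mapping preserves every relationship in both directions, which is the informal sense of 'commuting' used in this analogy.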


7.2.2  Classes  and  Types  in  Nomologic  Model  Theory 

When attempting to build a nomologic network, one deals with multiple potential orderings
embedded in a particular set-type-class grouping. So how does one compose a set ontology in
a parsimonious way? As Cronbach and Meehl (1950) noted, this does not make any set language
always unsatisfiable as one can construct using increasing power setting, but then tracing these
through the openness becomes increasingly difficult to get 'validate-able' descriptions in the
system of interest. The open structures then involve complicatedness (given a setting is
transfinite) and complexity (given otherwise). Then the network representation is necessary to
maintain  the  extent  on  proper  orderings  that  are  not  immediately  denumerable  (implying  a 
settable  linguistic). 

Ideally,  there  exists  a  definitive  ordering  constraint  at  some  'level'  in  the  system  or  mental 
model.  Provided  that  a  modeler  can  use  a  higher  ordering  assumption  (e.g.  multi-leveling, 
'ontological'  classes,  and  other  set  ordering),  he  or  she  can  insert  a  'construction'  that  then 
needs  to  be  justified,  identified  and  validated.  Otherwise,  for  computability,  it  must  be 
embedded  in  a  real  measurable  space  which  can  compress  to  dimensional  patterning,  but  then 
the  potentially  useful  orderings  for  system  simulation  become  hidden  (i.e.  embedded  in  the 
ordinal sets). Again, if there is a transfinite order, there is a potential higher ordering that can be
set by aligning mental model descriptions to a type hierarchy and the resulting language on which
a system can be based. But from work on the continuum hypothesis, this is not 'accessible'
purely  by  denumerable  systems  (i.e.  using  itemized  objects).  Then  in  the  spirit  of  information 
theory,  one  would  like  to  compress  it  to  a  transfinite  type  hierarchy  maintaining  the  categorical 
'construct  key'  for  the  formal  system.  Then  under  composition,  a  modeler  would  like  to  assess 
the compression to provide a fuller theory on the higher-order types and maintain a spatial
mapping like a network for these nomologies, which as assessed in Section 6 was the intuition
on  conceptual  models. 

The  fundamental  transformations  in  Section  6  were  'reversing'  by  giving  a  denumerable  basis 
and  assessing  using  open  sets  giving  an  analytic  basis.  As  we  compose  these,  the  types  make 
for  increasing  abstraction.  In  social  environments,  this  can  lead  to  non-symmetric 
representation  without  clear  mental  models  or  convergent  public  choice.  Note  then  this  does 
not  even  make  an  instanced  compression  per  se  'wrong'  as  the  social  group  might  have  sets 
and  orderings  dissonant  to  'their'  itemization.  For  example  different  financial  system  models 
can  be  'dissonant'  to  each  other  depending  on  the  underlying  stance  such  as  derivative  and 
value-based.  These  can  show  divergence  from  each  other,  but  we  do  not  think  these 
definitively  mean  these  are  'wrong'  purely  against  each  other.  One  can  observe  that  derivative 
captures  the  'openness'  induced  by  'hedging'  and  conversely  'value'  on  the  open  'utility'.  So 
then purely rationalizing over the large abstractions is not sufficiently, universally valid under its
own system. These are simply compressions which require that they validate to their quotient, that the
composition operant has a valid isomorphism, and that their 'large categories' stay homeomorphic
to the system of interest. The point is that this 'strong' validation requires an intense power setting
to achieve; otherwise there is limited inference from epistemic validity notions under just the
formal system or mental models independently.

As  a  validation,  the  system  of  interest  is  either  tautological  in  its  reality,  under  experimental 
designs  or  in  situ  (under  defined,  normalized  context).  This  means  any  extensions  have  to  have 
proper  unions  in  their  set  measures  and  then  at  (some  level)  provide  a  transfinite  ordering  to 
set its presented output. The financial engineering literature serves as a prime example as the
theory has inescapable realities based on categorical properties inherent in utility theory,
Brownian motion, contractual agent responses, etc. These classes might not be 'true' in totality
(presumably not all human constructs are hedging and pricing behavior), but they serve as an 'ideal'
compression as there is then a proper, traceable categorical class on all behavioral quotients.
This  then  allows  greater  setting  with  which  to  itemize,  and  then  yields  more  powerful 
observation  as  there  is  a  clear  class  ordering  and  known  valid  transformations  implied  by  its 
nomology.  One  can  use  a  transfinite  inductive  procedure  to  support  model-theory 
constructions  and  again  show  traceability  and  satisfiability  to  the  system  of  interest  "as 
categorical  defined".  If  the  settings  prove  proper,  then  our  nomologic  network  becomes  a 
proto-type  algebra,  and  if  properness  is  not  available,  then  there  are  clear  modal  statements 
with  which  to  test  further. 

In  the  meantime,  one  needs  to  trace  the  complexity  such  that  the  variety  on  ordering,  class, 
and  types  can  be  'stored'  and  'searched'  in  an  efficient  manner.  The  search  is  to  find  potential 
transfinite  compression  allowing  universal  closure,  and  this  while  'storing'  spaces  provided  by 
constructs  previously  identified.  The  theorems  over  categories  and  extensions  serve  these 
purposes  given  the  topologic  algebra  being  invoked  by  these  compression  types.  Then  one 
considers  how  these  model  and  theory  statements  are  to  be  'stored'  in  a  settable  manner. 
There is an underlying 'inconsistency', as how does one set an unsettable phenomenon? But one
could number by the ordinals themselves as a means for delimiting the constructs
when assessing the mental models (rational hypotheses on the "surrounding network" in a
space). Identified below are modal properties implied by two identified axiomatic set
theories.

Tiles  presents  a  good  assessment  and  representation  to  visit  regarding  the  constructability  on 
these  sets  (Tiles  2004).  The  investigations  into  axiomatic  logics  using  Cantor's  and  Hilbert's 
programs  gave  strong  power  to  abstractly  construct  settings  and  trace  transfinite  orders 
respectively.  The  centrality  was  not  accounting  for  the  wealth  of  'settable  things'  but  rather 
the  growing  power  set  on  their  potential  orderings.  There  is  shared  intuition  here  as  given  the 
self-creating  and  spontaneous  orderings  in  social  systems  one  is  likewise  not  as  concerned  by 
the set items but rather the wealth of transformative orderings. Gödel's theorems on
satisfiability then seem intuitive in that the 'construction procedure' by Zermelo et al does not
allow for 'accessing' these increasingly open ordinals; again not impossible but 'inconsistent'
internally  to  the  models  set  in  this  way.  So  this  colloquially  constructs  a  model  space  that 
cannot  have  a  single  universal  setting  to  switch  between  a  mental  model  and  the  world  as  it  is 
("rationalization  is  not  construct  validation").  Using  this  as  an  analogy  to  the  ordering  within  a 
social  system,  it  should  present  the  difficulty  in  getting  to  a  sufficient  'social  theory'  using 
purely  denumerable,  set  statements  and  conversely  ordinals  themselves  do  not  provide 
statements  that  are  'validate-able'  in  a  formal  (denumerable)  system. 

The  intuitive  consequence  to  using  these  together  is  the  ability  to  capture  computational 
systems  and  patterning  independently.  Then,  as  discussed,  one  can  program  in  functional 
transformations  on  the  models  under  a  bounded  domain.  This  is  complex  and  thanks  to 
impossibility  we  lose  certainty  in  our  transfinite  setting  on  our  control  system.  However,  in 
social  measure  theory,  there  is  accessible  measure  here  which  implies  some  constraint  on  the 
transfinite, social space. One final appeal: their approach has been to define through small and
large categories the extent to which these can be defined constructively or openly respectively.
Interesting results in category theory are under defined universal relationships ('functors')
that can extend the small and large categories ('Kan extension'), which under the social theory
would be synonymous with the "spun network" (i.e. extension) of 'constructs' (sub-universal
functors).  As  these  are  algebraically  constructed,  this  allows  for  transfinite  constructions  under 
sufficient  identification  over  some  meta-properties.  Something  systems  engineering  would  like 
to  bring  to  these  'socio-technical'  spaces! 

Consequently,  the  goal  is  'two-sided'  as  any  analysis  yields  loading  open  ordinal  sets  and 
'proper'-ness  at  some  spatial  location.  This  then  seems  natural  to  identify  systems  up  to 
morphism  to  some  available  small  category  that  preserves  the  large  categories  involved.  This 
allows  a  clear  procedure  that  separates  the  denumerable  expectation  from  the  open  construct 
being  claimed.  This  is  not  new  in  systems  engineering  as  agent  based  modeling  involves  social 
network  theory  implemented  often  using  a  topology.  This  is  usually  done  by  'constructing'  the 
openness  as  'social  relationship',  a  large  category,  while  seeking  to  present  denumerable  agent 
responses,  captured  in  a  sufficiently  small  category,  against  each  other.  Then  one  constructs 
smaller  categories  out  of  an  observable  large  pattern,  and  for  a  general  (denumerable)  theory 
in this area, one would need to show a definable transfinite order up these 'categorical layers'.
What is needed is to extend or simulate these categories showing that the social space is then
valid  across  these  modalities  in  a  system  of  interest. 
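
A minimal agent-based sketch of this separation follows. The 'social relationship' is fixed as a small network (standing in for the large category), while each agent's response is a simple, enumerable threshold rule (standing in for the small category). The threshold rule, the four-agent line network, and the function names are assumptions chosen for illustration and are not drawn from the source.

```python
def step(states, neighbors, threshold):
    """One synchronous update: an agent adopts once enough neighbors have."""
    new_states = {}
    for agent, nbrs in neighbors.items():
        if states[agent] or not nbrs:
            new_states[agent] = states[agent]   # adopters never revert
        else:
            fraction = sum(states[n] for n in nbrs) / len(nbrs)
            new_states[agent] = fraction >= threshold
    return new_states

def run(states, neighbors, threshold=0.5, max_steps=100):
    """Iterate until no agent changes state or max_steps is reached."""
    for _ in range(max_steps):
        next_states = step(states, neighbors, threshold)
        if next_states == states:
            break
        states = next_states
    return states

# Hypothetical line network a - b - c - d, with only agent 'a' adopting at first.
NEIGHBORS = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
START = {"a": True, "b": False, "c": False, "d": False}
```

With a threshold of 0.5 the adoption spreads along the whole line, while at 0.6 it never leaves agent 'a'. The same network therefore yields different enumerable outcomes depending on the response rule, which is why the relationship structure and the agent responses each need their own validation.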

7.2.3  Transfinite  Ideal  Categories 

A  systems  engineer  would  then  like  to  develop  these  large  and  small  categories  to  identify 
either  a  space  or  language  respectively.  One  step  would  be  to  identify  a  basis  and  unit 
respectively,  but  by  definition  (or  rather  by  impossibility  theorem),  these  are  not  (sufficiently, 
completely) available. So rationalizing these via categories into an 'ideal' construction is
needed to fit a particular purpose. This ideal would then be developed such that all
large categories have orderings that are homeomorphic, large-to-small category transformations
are homomorphic, and small categories have proper classing (implying a set ordering), thus
giving an executable architecture. While this implies choice ordering itself, which by
extension might merely be moving the problem, the intent is to take the intuitive form of Turing's
logical program (Appel 2014) but apply it to functor satisfiability rather than a denumerable
setting.

The page taken from Turing is to justify an algebraic approach rather than trying to pose proper
orderings. His was an attempt at resolving Gödel's paradox by defining an ordinal logic that
bounded the space giving numerical construction, allowing valid replication (his concern was
mechanical replication). Similarly, one can ask by analogy: how does one transfinitely order a
categorical hierarchy in a language such as the UML/SysML type-class hierarchy, allowing valid
replication by an open, 'social' system? If this has a bijection, then the open system can
naturally  output  in  this  language  without  loss;  so  this  'ideal  model  language'  can  both  express 
and capture theory. As UML/SysML has its background in specifying formal systems, and the
opposite domain (i.e. the social system) is not parsimonious, pure compression by a single
language does not universally satisfy a particular conceptual network.
Humans as ordering devices show more than a single categorical language, so the conjecture is
that a single language is not even stable. As latent variables defend the stance that
psycho-social systems have ordinal modular properties, the need is to define the modules up to large
categorical constructs rather than the small categorical type module. Engineers should therefore
be motivated to identify an open language rather than a single set linguistic to provide a base
for formal 'socio-technical' systems.

Since this is an idealized discussion, some justification is needed. Appeals will involve classic
results in logic, set theory, and extension to categories, and these are then used to conjecture a
hypothesis on an 'ideal language' for constructs implied by the surrounding discussions. These
should not be taken as sufficient from theorem (as this is an intuitive construction) nor in
theory (as this is a schematic impression over general latent variables). The appeals will stick to
the axiomatic systems in Zermelo-Fraenkel+Choice set theory (ZFC) and those that allow
'spawning' of classes at open ordinals in von Neumann-Gödel-Bernays set theory (NGB). These,
matched with available structural transformation relationships implied by overlaying category
theory, allow for enough ordinal space concerning the review encountered.

Tiles provides a descriptive diagram showing the intuitively constructed space. Using recursion,
one  covers  ordinals  growing  a  denumerable  model.  Working  from  agents  as  ordering  functions, 
this  is  necessary  as  this  would  be  an  object  at  some  ordinal  greater  than  the  zeroth  ordinal  if 
their  behavior  is  to  be  captured.  The  increasing  space  allows  for  greater  'ordinal  combinatorics' 
that  can  be  numbered  by  omega: 


A 'constructive theory' is then defined by traditional ZFC construction on a 'model' defined
over some 'proofing system', which serves to determine which elements are valid under the previous
ordinal. A model at an 'alpha-level' then has a 'minimal model' that is strictly consistent within
ZFC. Beyond it lies the space of 'constructive' statements that are potentially
independent of the minimal model, into which one can insert a function in the 'proofing system'
that is given from outside analysis (i.e. a 'forcing' function). The 'axiom of constructibility' seems a
natural forcing as assessed from the mereologic validity encountered in the LVT conceptual
review. Here the 'well-founded sets' are defined by a von Neumann universe allowing those
statements constructible in a transfinite hierarchy. Finally, the 'cumulative limit' defines the
additional proper classes that (potentially) exist which are not 'nicely constructible', and hence
suffer indeterminate construct validity (e.g. construct fallacies that needed increasing
diagonalization) in our system. The choice of NGB for this 'space' allows many of these
statements to be specified on their own, potentially independently of the formal ZFC model. It
then fits the ideal that it has representation in a large category and available complex
transformations therein (Muller 2001).

[Figure 16 labels: 'such that V = L'; 'cumulative limit', such that V ≠ L ('von Neumann hierarchy')]

Figure  16  -  Hierarchy  Settings  (adapted  from  Tiles  2004) 
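
The recursion behind the hierarchy in Figure 16 can be illustrated, as a minimal sketch only, by building the finite stages of the von Neumann universe through repeated power setting. The function names below are illustrative assumptions and do not come from the source. Only finite stages are computable this way, and at finite levels the constructible and cumulative hierarchies coincide, so the distinction between V = L and V ≠ L cannot appear in such a toy.

```python
from itertools import chain, combinations


def power_set(elements):
    """Return the set of all subsets of `elements`, each as a frozenset."""
    items = list(elements)
    subsets = chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )
    return frozenset(frozenset(subset) for subset in subsets)


def von_neumann_stage(n):
    """Return V_n, where V_0 is the empty set and V_(k+1) is the power set of V_k."""
    stage = frozenset()
    for _ in range(n):
        stage = power_set(stage)
    return stage


# Sizes of V_0 .. V_4; each stage has 2 ** (size of the previous stage) elements.
stage_sizes = [len(von_neumann_stage(n)) for n in range(5)]
print(stage_sizes)
```

The stages are cumulative (each stage is a subset of the next), which is the sense in which a model 'grows' as one moves up the ordinals; the transfinite stages discussed in the text lie beyond what this sketch can enumerate.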


The conjecture is that this provides a nice capture of the nomological network within construct
validity, as defining NGB classes around the constructibility allows the open mereology
observed in psycho-social measure. This allows the 'model typing' that is needed for "testable
rationalization" within a formal ZFC framework, as there is the shared, nicely termed
'constructible universe'. Since 'unintended consequences' were self-referencing definitions on
the constructibility, this framework is symmetric and synonymous given the inconsistency in
NGB allowing universal classes. There are then large categorical properties that can be loaded,
as ZFC has been shown to have an interface to topologically concerned modal logics;
Morse-Kelley set theory might be of interest from the topological analysis in the 'social' and
'conscious' portions of theory (Kelley 1975). Then computational methods would be available given enough
identification  on  these  class  compressions  such  that  ZFC  representation  allows  entrance  on 
recent  computational  social  science  methods  (Conte  et  al  2012).  The  implied  hierarchical 
nature  also  mirrors  the  prototypical  approaches  in  complex,  post-modern  classification 
approaches  (Harvey  &  Reed  1996). 

Specific 'niceties' of this interactive language allow for those observations of what, as Gödel
notes, "is either too big for a machine or too small to be encapsulated by all of math" in some
notational fashion. The set theoretic categories are then not a 'solution in itself', as these two are
'inconsistent', but rather a powerful enough language in which to represent abstractions. Then
architecture in this manner would have natural 'modal types':

Open class relationships invoked

The latter may be unnecessary if this 'social construct' is transfinitely settable in situ; e.g.
classical economics shows consistency in certain contexts. This is shown by having a
'V-technical quotient' that provides a clear order in which to load a technical system. This
presents a natural transformation that allows a von Neumann-Morgenstern ordering to have
strong epistemic validity given the proper classes are involved.

The dialectic implied by ZFC and NGB inconsistencies could then represent the post-modern
difficulties with social science, and why behavioral insertion in economics has had difficulties.
This is because capturing L as an (independently choiced) object is a difficulty, but one identified as
needed within the studies. In other sciences, this is less trivial as the hypothesis method lets us
infer the 'L-language limit' and find the supporting evidence for the 'N-constructed universe' more
straightforwardly (although in social science the N-'universe' is more accessible). Then one can
define an 'M-model' up to a known 'N-universe', and this leverageable theory is 'nicely computable'
as it is congruent to ZFC set theory. Yet when studying and imposing an 'L-language limit' over a
language generator, something we hope is a given 'human nature', this appears as human
'adapting' that breaks its assumed limits in an instanced theory. This implies that
unintended consequences are a result of a formal systems language choice! This
encapsulates classic study problems such as self-fulfilling prophecies as being defined by
this 'L-language' being extended and injected on the 'N-universe', which would appear under
measure as circular observations. Then humans, as a language device (an 'L(N)-language
function'), need more ordinal space to be sufficiently covered. An 'L*-recursed language' could
be  constructed  within  NGB  and  then  assessed  for  how  symmetric  this  is  to  the  current  ZFC 
model.  This  'dissonance'  pseudo-measure  could  show  the  'higher-order'  phenomena  in  the 
NGB  model.  But  the  independent-ness  between  the  two  representations  does  not  necessarily 
demand  a  formal  system  as  public  choice  may  determine  this;  e.g.  engineers  would  not  want  to 
change  entire  aerospace  development  models  instantaneously  just  because  people  imagined 
an unvalidated mental model for planes. This is then just a recursion on the cumulative
potential in the population's mental models, which might be ultimately 'incomplete' in any
well-formed 'N-universe' at ordinal level. But as humans can carry on well enough without being
well-formed in their mental models, this is then just independently 'dissonant' and requires
additional steps to establish either's validity.

There  could  be  new  constructions  by  trying  to  develop  both  system  models  in  both  ZFC  and 
NGB  settings.  This  then  allows  a  'soft'  ordering  in  our  choice  defining  the  epistemology  with 
ZFC  and  ontology  with  NGB  as  one  progresses  in  small  and  large  category  respectively.  This  can 
be  assessed  in  social  measure  theory  as  the  EFA  +  CFA  =>  PCA  seemed  to  imply  categorical 
extendibility. Even more interesting, this might not be a problem as one could constrain the L*
in NGB via a topological limit L*->L (i.e. pattern or topos) under some higher-order procedure.
This would set the classes in NGB and be transfinitely bounded in ZFC, giving a constructible
metric framework up to an ordinal assignment. The categorical and topological papers implied
this either by intent or by the underlying abstract mathematical construction. This might be
(efficiently) incomputable by man or machine, but 'constructible' in the model theoretic sense
such that socio-technical systems can have transfinite extensions despite technical constraints
in any instance. Yet this would still be indexed dually by 'tuples' over the effective
'aleph-classes' consumed in NGB before getting to 'nicely' modellable ZFC, which is central to the
construct validity issues.

Then  finding  the  uniqueness  over  NGB  'construct  classes'  is  the  inescapable  epistemological 
solution(s).  There  is  typing  over  'construct  categories'  by  their  ordinal  numbering  in  a 
'nomology network'. While neurology or other biological extensions can 'build up' classes
under NGB as well (hence the potential relevance and symmetry on these models and
measures), this holds only for an N with the inference potential N-'morphic'->(L, L*) (i.e.
there is a shared construct language). Now this can converge by adding analogies, but one
must use a 'two-sidedness': validate a defined N-structure, observe changes in N, and monitor
the adjoint under an ordered extension. This is covered by ZFC for the instanced hypothetical
validity and by NGB, which could assess whether classes converge and close on a construct; their
relationship shows this 'strong' epistemic validity. If this is identified up to an ideal
'order-preserving isomorphism', then one has a full quotient for a V-technical space allowing a set
system language to be used ubiquitously.

This  seems  the  entrance  to  the  Rosen  categories  in  measure  theory  as  one  deals  with  these 
class  constructions,  but  needs  categorical  relations  in  its  meters  within  the  'well-constructed' 
system.  In  human  inference,  the  importance  seems  reversed  as  possible  classes  outpace 
everything  in  which  case  the  Rosen  categories  can  be  an  easier  compression  on  'measures  of 
interest'.  This  appears  to  one  used  to  ZF(C)  as  terribly  invoking  class  changes  (likely  why  social 
theory  and  construct  validity  can  appear  to  overly  dismiss  physical  evidence),  but  under  NGB, 
these  can  be  well  founded  with  hypotheses  on  their  logical  ability  to  split  questions.  In  formal 
representative  theory  under  denumerable  sets  (e.g.  Newton's  Three  Laws),  this  starts  to 
become  a  questionable  intuitive  approach  over  purely  social  enterprises  as  this  leads  to  an 
overly constrained enterprise. This then conjectures possible trading on the class changes, and
could use NGB classes previously identified to construct a [system, metric] pair to assess. This
could give better assessment on operationalizing more 'general theory': Psychoanalytic,
Gestalt, Psychodynamic, and the broad range of 'schools of thought'.

Finally, we consider the 'spawning' potential involved in the categorical extension. Economics
has found interesting extensions in biology, physics, engineering, and social behavior. This is
likely because an ordered relationship mirroring a von Neumann-Morgenstern utility
space provides a numerical basis on 'tradeable objects'. This could be used in categories to
define over a 'utility domain' that fits an 'economics category' as a construct. These are
constructible in ZFC as these define a von Neumann universe ordered by an 'economic constraint'
functor that could be applied over any ZFC-'model objects' under an NGB-'utility class'. Given a
defined pattern between NGB->ZFC, this intermediate category could be extended over the several
varieties in which humans are involved under different orderings, applying economic games as
ordering different 'levels' in a system. This also implies that the wealth of economics is less a
universal truth than a regular pattern with which humans categorize, and so could provide
extensions for behavioral economic encapsulation. Presumably, in a formalized language, this
could hope to find similar categories that relate a broad range of behaviors by assessing
generalized theorems with different systematic bases. This would then be defined by having
repeatable class compression with a denumerable basis, whenever valid, by transferring the
class pattern.

The goal then is to provide a means for formally expressing theory such that it contains both
the valid results (i.e. logics) and, more importantly, the ability to mirror the appearance of
'unintended-ness' in the results. Invoking both ZFC and NGB axiomatic sets allows more
descriptive power, although not always a clear programmatic instance. Investigating how this
systematic setup appears is at the heart of information theory in complex systems. The core
observation is on replicating inconsistencies such that the logical setting internalizes them but also
gives resolution with a particular choice or proofing system. What would be interesting for
systems engineering is the extent to which different independent axiomatic systems might be
applied. In this case, this initially allows a 'separation' effect between observable models
(ZFC) and human constructible models (NGB), and forces their interaction in a way that
replicates the validity concerns in social measures. Categories are then useful as a means for
maintaining  order  internally  and  communicating  the  systems  to  each  other.  It  is  natural  then  in 
category  theory  to  use  small  categories  to  show  ZFC,  large  categories  for  NGB,  and  identify 
functors  that  provide  a  map  from  objects  in  one  to  appropriate  epistemology  in  the  other. 
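
To make the notion of a structure-preserving map concrete, the following is a minimal sketch of checking the functor laws between two finite categories. It is an illustration under stated assumptions only: both categories here are finite (a genuinely large category cannot be enumerated in code), and the class name `FiniteCategory`, the function `is_functor`, and the example categories are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass


@dataclass
class FiniteCategory:
    arrows: dict    # arrow name -> (source object, target object)
    compose: dict   # (g, f) -> name of the composite "g after f"
    identity: dict  # object -> name of its identity arrow


def is_functor(source, target, on_objects, on_arrows):
    """Check that the object and arrow maps preserve endpoints, identities, and composition."""
    for name, (dom, cod) in source.arrows.items():
        if target.arrows.get(on_arrows[name]) != (on_objects[dom], on_objects[cod]):
            return False
    for obj, ident in source.identity.items():
        if on_arrows[ident] != target.identity[on_objects[obj]]:
            return False
    for (g, f), composite in source.compose.items():
        if target.compose.get((on_arrows[g], on_arrows[f])) != on_arrows[composite]:
            return False
    return True


# Hypothetical category with two objects and a single non-identity arrow f: A -> B.
arrow_cat = FiniteCategory(
    arrows={"id_A": ("A", "A"), "id_B": ("B", "B"), "f": ("A", "B")},
    compose={("id_A", "id_A"): "id_A", ("id_B", "id_B"): "id_B",
             ("f", "id_A"): "f", ("id_B", "f"): "f"},
    identity={"A": "id_A", "B": "id_B"},
)
# Hypothetical one-object category with only its identity arrow.
point_cat = FiniteCategory(
    arrows={"id": ("*", "*")}, compose={("id", "id"): "id"}, identity={"*": "id"}
)

# Collapsing everything onto the single object preserves all structure.
collapse_ok = is_functor(arrow_cat, point_cat, {"A": "*", "B": "*"},
                         {"id_A": "id", "id_B": "id", "f": "id"})
# Swapping the objects while keeping f fixed breaks the endpoints of f.
swap_ok = is_functor(arrow_cat, arrow_cat, {"A": "B", "B": "A"},
                     {"id_A": "id_B", "id_B": "id_A", "f": "f"})
print(collapse_ok, swap_ok)
```

The second map fails because it does not respect the direction of the arrow, which is the kind of check one would want before claiming that a mapping between a 'technical' and a 'social' representation preserves the aspects of interest.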

The underlying representation of categories in graphs allows the use of commutative diagrams. This
also might transfer to patterning potential in human mental models, as Trochim & Linton (1986)
show an example of measuring graphical patterning. These in combination might allow
measuring human patterns, which can then be used as (semi-)automata from ordinal
observations of potential latent profiles. The graphical representation would have
databasing potential for its own representation, and as Spivak & Kent (2012) have shown,
there is a direct relational databasing output possible from sufficiently defined categories. This
would have the potential, given sufficient definition, to aid in decision making and visual aiding.
The algebraic underpinning could also serve to 'solve' social systems in a more abstract manner,
as at any ordinal level any investigation could output to the closest "social norm" (Nyborg et al
2016) identified by the 'closest large category'.
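
The relational databasing output mentioned above can be sketched as follows, assuming the common reading in which each object of a schema category becomes a table and each arrow becomes a foreign-key column. This is a hedged illustration rather than the construction of Spivak & Kent; the schema names (`agent`, `organization`, `member_of`) and the helper `create_statements` are hypothetical.

```python
import sqlite3

# Hypothetical schema category: objects become tables, arrows become columns.
SCHEMA_OBJECTS = ["organization", "agent"]
SCHEMA_ARROWS = {"member_of": ("agent", "organization")}  # arrow: source -> target


def create_statements(objects, arrows):
    """Translate each object into a table and each arrow into a foreign-key column."""
    statements = []
    for obj in objects:
        columns = ["id INTEGER PRIMARY KEY"]
        for name, (source, target) in arrows.items():
            if source == obj:
                # NOT NULL expresses that the arrow is defined on every source row.
                columns.append(f"{name} INTEGER NOT NULL REFERENCES {target}(id)")
        statements.append(f"CREATE TABLE {obj} ({', '.join(columns)})")
    return statements


connection = sqlite3.connect(":memory:")
for statement in create_statements(SCHEMA_OBJECTS, SCHEMA_ARROWS):
    connection.execute(statement)

# An instance of the schema: one organization and two agents mapped onto it.
connection.execute("INSERT INTO organization (id) VALUES (1)")
connection.executemany(
    "INSERT INTO agent (id, member_of) VALUES (?, ?)", [(1, 1), (2, 1)]
)

# Following the arrow is a join from the source table to the target table.
members = connection.execute(
    "SELECT agent.id, organization.id FROM agent "
    "JOIN organization ON agent.member_of = organization.id ORDER BY agent.id"
).fetchall()
print(members)
```

Under this reading, following an arrow corresponds to a join, so questions about composite relationships in the category become ordinary queries over the loaded instance.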

This then raises the question of how to create an ordinal topology sufficient for this space and
how one 'loads' the database to maintain the objects across it (neither of which is a small feat).
Using  our  diagram  from  Tiles,  we  would  'copy'  the  image  of  the  ordinal  space  such  that  choice 
ordering  could  be  placed  on  the  classed  'left-side'  and  choice  ordered  object  model  set  on  the 
'right-side'.  Then  one  would  look  for  a  functor  or  morphism  that  preserves  the  aspect(s)  of  the 
construct  that  one  cares  about:  the  invoked  class  ordering  to  the  available  ordering.  This  then 
allows  a  triple  such  that  a  construct  is  defined  by  its  topos  organization,  functor,  and  itemized 
set. 

The continuum hypothesis allows ZFC to reach a defined ordinal from power setting up to a
particular aleph space (i.e. a 'bottom-up' architecture construction). Conversely, class ordering
can provide decomposition rules (e.g. a 'top-down' functional architecture) that allow for
increasingly powerful NGB spatial identification. These ideally are identified up to an
isomorphism for any intermediate ordinal space, which would yield a 'nice', 'computable' theory
for any social construct. Likely this is not guaranteed, so one might identify weaker morphisms
which still provide a 'usable', compressed theory. However, each gives encapsulated objects
which have reducible relations in the category of sets, with algebraic functions which can be
simulated. This could yield (semi-)'nice' sets of theory which would have propositions in both
'technical' and 'social' space. These would still be complex, but one could 'trace' observations
by their 'n-tuples' of categorical choices. Then, with any settable proper classes, one can again
use Turing's logical architecture against the modeling languages themselves: loading a generative
automaton from a sufficiently defined self-limited language (UML/SysML), and more importantly
modulating it so that it knowingly halts when no longer in its construct domain!


The hypothesis is that this dual representation has the power described above for
'socio-technical' systems. As a short patterning, we show this over three arbitrary levels in a model
architecture (Figure 17) as a first conjecture to cover the purely algebraic typing at level (i-1),
purely topologic typing at level (i+1), and the potential mixture modeling at level (i):


[Figure 17 legend: L, "NGB Classes"; M, "ZFC Model"; "choice at omega"; 'ordinal limit for Model at aleph'; "realization", "homeomorphism"; 'decomposition relationship']


Figure  17  -  Hypothesized  dual  representation 


This yields a type hierarchy by conjecture over the constructibility of the sets. These
constructions as mereologics would then define an ordering for the system which could be
assessed by simulating statistical social measures. This would also imply an 'ordinal confidence'
as one 'moves up' the hierarchy, which would allow a quasi-quotient metric for the constructs
(i.e. the more mereologic specification, the more potential construct validity issues invoked).
Likewise, shaping statements such as '-ilities' specifications could be represented by their class
construction, assessed in a variety of instances that are validated as representative: by
continuously assessing set SME statements, validating public choice ordering, and observing
latent profiles in action. Given a sufficiently large library, one could then identify distancing
over the class compressions from various sources, and then useful abstraction in aspects of
socio-technical systems. For example, it would be useful to assess maintainability as a defined
modellable metric (ZFC), assess that the metrics serve as a useful measure (SME/public assessment),
confirm that this remains an ordered scheme (NGB), and search additional schematic notions to define
an appropriate 'ontology' for an implementing architecture (choicing an M-'model hierarchy'). This
could be specified in various orders via model assignment {a}, open set decomposition (r), and
specification at ordinal level <n>. Hopefully this could serve as an ordered trace on any systems
modeling procedure specified by constraint type, ordinality, and construct domain.

The hope is that this serves as a relatively simple category to type-organize the socio-technical
language but, more importantly, to identify and cross-validate categories. As Gödel et al were
concerned with intuitional organization, this might even prove to assist in transmitting better
discourse on these open, intuitional aspects in enterprises. If the more open and dissonant
aspects can be formed into a similar language, this might also aid in being a (semi-)automatic
typing itself by again naturally incorporating both modellable and inherently inconsistent
aspects. As set theory has been well realized in model management and activities in systems
engineering, categories then might also prove to assist in translating open aspects in
socio-technical systematic programs.


7.2.4  Discussion  and  Next  Steps 

As  was  discussed,  involvement  of  social  theory  brings  with  it  several  abstractions  that  have  to 
be  negotiated.  While  there  are  clear  physical  objects  involved,  measuring,  compressing  and 
extending  theory  then  involves  aspects  which  are  not  physically  instantiated.  People's  ability  to 
order  and  generate  language  are  underlying  reasons  for  these  difficulties.  This  has  prompted 
social  science  to  involve  additional  aspects  in  their  studies,  and  these  modalities  are  different 
from traditional technical ones. However, this does mean that any analysis involving a space which
covers social scientific areas must inherit these modalities, and from analysis of the
modalities, these become a complex system.

The theory encapsulation and the associated modeling involved must be able to deal with
ordinals from human activity and abstraction due to language dynamics. These aspects do not
immediately invalidate any model, but add dimensions to its satisfiability and validity. While
rationalization helps in this regard, there are potential inconsistencies in substituting
rationalization with analysis and analyzing over a centralizing rationale. This involves any
system self-analyzing its own inconsistencies and monitoring the ordinals induced, and this
requires constructing a complete system against any context to be able to be validated against
other 'constructions'. The finding of the review in social theory is that, while the expanse becomes
difficult, there are plenty of constructible observables to use.

Identification of a system over ordinals and open linguistics is needed in this regard. The
theorems in abstract mathematics give a formal representation for these organizations of
constructs. These mathematical abstractions and modalities can be identified up to the notion
of a 'nomological network' that underlies latent variable theory and other social theory. This
should allow a presentation of 'constructs' that helps give an abstract modularity within
enterprises. The challenge then is maintaining a representational language for a system
powerful enough to capture the formal abstractions involved. Apparent in social theory is the
varied use of deductive and inductive procedures, which then require an increasing program
to incorporate the phenomena involved. The modalities in axiomatic sets and homotopic types
within category theory allow means for these compressions.

A  system  with  ordinal  constructs  then  could  be  identified  by  small  category  representation, 
large  category  extensions,  and  tracing  via  transformations  therein.  The  presentation  did  this 
with  ZFC  axiomatic  set  for  the  small  categories  and  NGB  axiomatic  set  for  the  large  categories. 
Then  the  tracing  exchange  was  around  the  instanced  architecture  (i.e.  common  transfinite 
hierarchy),  and  the  extensions  could  be  made  for  model  composition  around  a  'construct'  (i.e.  a 
set  large  category).  This  allows  one  the  freedom  to  be  able  to  represent  a  multitude  of 
potential  languages  across  various  orderings,  but  allows  some  abstract  spacing  as  these  must 
be noted with their categories, morphisms, and ordinal numbering involved. Any system
surrounded by context, constraint, and formality could then use a Turing(-like) logical program
to develop any instanced enterprise system.

Investigating this system going forward will involve the broad and interdisciplinary aspects that
are traditional to systems engineering. To simplify the discussion, the three aspects of the
system serve as good initial types. The small category considerations are scripted as traditional
modeling efforts, so this is a natural setting. The large category considerations surround current
social science efforts, as one imagines the involvement of open categories will require this in the
study, and again the impression is that the logic in the social sciences mirrors this. This then gives
colloquial categories for 'technical' and 'social' respectively. Then systems engineering efforts
could be placed in investigating the research in those morphisms that serve applications. And
of  course,  underlying  all  of  this  will  be  the  abstraction  language  undergirding  this  from  the 
mathematical  sciences.  This  likely  would  involve  a  tripartite  effort  in  case  creation  with  a 
science  investigating  the  implications  of  maintaining  the  inconsistent  aspects  in  the  dynamics 
on  the  ordinal  placements. 

As an initial area for theorem discussion, Arrow's classic inconsistency theorem provides a good
candidate as it involves a 'linguistic limit' in the organization of decision making. Decision
theory itself is an important discipline in enterprise systems and provides a common thread.
The inconsistency surrounding decision and voting systems provides a large category with limit,
and the extensions that have been developed provide smaller categorical 'transfer'. Likewise,
its ability to form another general construction in general equilibrium theory would provide
another 'pattern' categorically. A categorical review of the model management aspects in
these areas might provide insight into patterns in these ordinal 'public choice' constructs. The
basis in theorem and rational models would provide examples of viable extensions, as work since has
identified tractable subcategories from Arrow's original system (Reny 2001).
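
The kind of inconsistency at stake in voting systems can be illustrated with the classic Condorcet cycle, in which three individually consistent rankings produce an intransitive majority preference. This is only an illustration related to Arrow's result, not a proof of it; the ballots and function names below are assumptions made for the sketch.

```python
# Three hypothetical voters, each with a strict (transitive) ranking, best first.
ballots = [("a", "b", "c"), ("b", "c", "a"), ("c", "a", "b")]


def majority_prefers(ballots, x, y):
    """Return True when a strict majority of ballots rank x above y."""
    wins = sum(1 for ballot in ballots if ballot.index(x) < ballot.index(y))
    return wins > len(ballots) / 2


def majority_relation(ballots):
    """Collect every ordered pair (x, y) for which the majority prefers x to y."""
    options = sorted(set(ballots[0]))
    return {
        (x, y)
        for x in options
        for y in options
        if x != y and majority_prefers(ballots, x, y)
    }


def is_transitive(relation):
    """Check that (x, y) and (y, z) in the relation always imply (x, z)."""
    return all(
        (x, z) in relation
        for (x, y) in relation
        for (w, z) in relation
        if y == w and x != z
    )


relation = majority_relation(ballots)
print(sorted(relation), is_transitive(relation))
```

Each voter is internally consistent, yet the aggregate prefers a to b, b to c, and c to a, so no ordering of the options represents the group. Restricting the admissible ballots is one way such 'tractable' sub-cases are typically obtained.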
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Conversely  another  area  for  the  analysis  models  might  be  health  systems  as  behavioral  health  is 
a  complexity  aspect  in  this  area  (Rouse  2008).  A  significant  portion  on  the  latent  variable 
theory  involved  health  behavior,  and  areas  of  community  health  immediately  involve  social 
psychological  aspects.  This  creates  a  wealth  of  ordinal  aspects  with  a  technical  underpinning 
(i.e.  chemical,  biological,  and  economic  aspects  in  healthcare).  This  has  been  an  important  area 
for  systems  and  enterprise  engineering,  and  would  by  argument  involve  several  categorical 
theoretic  problems.  Another  similar  area  for  analysis  might  be  financial  systems  as 
econometrics  and  stochastic  analysis  provide  categorically  different  measure  theory  set-ups. 

So then showing the spacing aspects in ordinals as a potential means for splitting the aspects
over these measure-theoretic set-ups might provide clear rational deduction on the descriptive power.

Overall  there  were  many  aspects  found  in  research  and  enterprise  activities.  The  important 
question  is  does  the  construct  language  appear  this  way  under  our  rationalization  or  is  this  an 
actual  underlying  encapsulation?  It  would  be  pertinent  to  assess  compression  schemes 
involving  language  in  a  progressive  manner  to  use  the  time  and  dynamics  as  an  extra  dimension 
for  validation.  This  is  a  heuristic  in  social  theory  and  likely  then  an  important  aspect  herein. 
Additionally  there  is  likely  a  multitude  of  existing  modeling  threads  for  category  theory  to 
encapsulate,  so  there  is  no  immediate  need  to  assess  actual  systems  before  investigating  its 
possibility. This can be assessed by organizing system architectures and description aids via
these categorical notions and inducing the various valid aspects. Then, having provided the 'there
exists' portion, one can begin to inject these notions into systems to assess how this works as a universal capture.

The  wealth  in  documentation,  case  examples,  models,  and  theory  should  provide  many  places 
to  start. 

Lastly  for  the  aspects  involving  enterprise  and  socio-technical  systems,  this  leaves  a  wealth  of 
space in which to investigate but also questions of which to be mindful. An immediate area in which to
provide valuable input is helping modelers navigate the modal aspects involved in various
theory and models. How much can be identified as pre-scripted categories from mathematics
and  how  much  will  be  observing  systems  with  categorical  aspects?  This  is  a  subtle  difference, 
but  from  construct  validity  the  directional  ordering  matters  in  these  systems. 

This  would  inevitably  be  an  activity  allowing  system  engineers  the  ability  to  document  domain 
languages and theory for their linguistic and ordinal aspects. This would initially allow
domains to be captured with minimal injection while still providing a library and composition
methods for these model domains. This is already a common activity for systems engineers,
and the additional aspect will be maintaining not only 'bodies of knowledge' but also the
'repertoire' of categorical transformations. Hopefully this could be a defined 'library' for
socio-technical systems that provides domain model theory by indexing ordinal number, language
constraint, and construct. How might this be accomplished?

Another  point  of  usage  from  systems  engineering  would  be  to  expand  on  the  activity  of 
tradespace construction. Involving ordinal constructs then leads to the question of how
one chooses and trades over these aspects. This 'ordinal tradespace' would be a new concept
and difficult to define. How does one do this in a structured manner? As it is an incompleteness
measure, is there even a semi-numerical way this can be represented or accomplished? There
is  a  certain  paradox  in  discussing  how  to  structure  in  a  space  assuming  incomplete  structures. 
As  other  areas  have  identified  impossibility,  incompleteness,  and  paradoxical  aspects,  the 
optimistic  view  is  that  this  has  challenged  their  disciplines  to  expand  and  provide  strong 
explanations, methods, and theory respectively. Socio-technical systems are powerful notions
given their construction, and one must not expect simple explanations; rather, the objective
is to find simple compressions across this domain.


7.3  Future  Research 

Through  the  course  of  developing  the  approach  documented  in  this  section,  a  number  of  future 
research  topics  were  identified  that  would  allow  for  further  refinement  and  improvement  of 
enterprise  modeling  for  the  detection  of  unintended  consequences.  These  research  topics 
resulted  from  the  major  challenges  encountered  and  described  earlier  in  this  report.  In 
particular,  for  a  truly  robust  and  practical  modeling  approach  one  needs  to  have  an  inventory  of 
potential  models  or  theories  that  are  relatively  easy  to  navigate,  select,  and  compose.  The 
problem  is  that  theories  and  models  were  never  developed  or  organized  for  this  purpose. 
Consequently,  the  following  research  topics  were  identified  to  facilitate  this.  It  should  be  noted 
that  it  is  expected  that  addressing  these  topics  will  be  a  long-term  effort  involving  many 
researchers  from  many  different  disciplines. 

1.  Develop  a  theory  for  partitioning  models  for  reuse 

Most  models  are  developed  for  a  specific  purpose.  Consequently,  no  effort  is  made  to  minimize 
the  possibility  of  transition  linkages  with  other  potentially  related  models.  Is  there  a  way,  at 
least  within  a  particular  disciplinary  area,  to  partition  conceptual  models  in  such  a  way  to 
minimize  transition  linkages  across  developed  models?  If  this  could  be  done,  model  reuse  and 
composition  could  be  greatly  facilitated. 

2. Develop an organizational scheme for "refactored" models and theories

Assuming  that  one  could  refactor  models  and  theories  as  described  in  topic  1,  how  should  they 
be  organized?  Some  models  and  theories  will  be  complements,  and  others  will  be  substitutes. 
Furthermore,  different  theories  and  models  will  exist  at  different  layers  of  abstraction.  Are 
there  systematic  ways  to  make  the  relationships  among  the  refactored  models  and  theories 
explicit  and  organized  to  facilitate  search  and  composition? 

3.  Develop  pasting  rules  for  imperfect  combinations  of  models 

As  discussed  previously,  it  is  unlikely  that  even  related  models  can  be  refactored  such  that  all 
potential  transition  linkages  are  eliminated.  When  these  occur,  we  will  find  ourselves  in  either 
case  2  or  case  3.  Currently  "handshake  algorithms"  or  pasting  functions  are  largely  developed 
on  an  ad  hoc  basis.  Are  there  any  ground  rules  or  heuristics  that  could  accelerate  the 
development of these when necessary? If an experiment is required every time one wants to
develop  a  new  handshake,  the  utility  of  these  models  for  practical  policy  analysis  will  diminish 
rapidly. 

4.  Develop  a  system  for  exploring  variations  on  models 

As  noted  previously,  if  one  really  were  to  build  an  inventory  of  composable  models,  it  is  unlikely 
that  one  would  be  able  to  evaluate  every  possible  permutation  for  many  policy  questions.  So, 
how  should  one  select  which  variations  to  evaluate  in  such  a  large  space?  Could  one  develop 
distance  metrics  to  guide  the  equivalent  of  a  sensitivity  analysis  over  model  structure? 
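
As a minimal sketch of what such a distance metric might look like, the example below treats each model variation as the set of peripheral models attached to the core and measures distance as the number of peripheral models by which two variations differ. The peripheral model names and the choice of a set-difference (Hamming-style) distance are illustrative assumptions, not part of the reported approach.

```python
from itertools import combinations

def structure_distance(a: frozenset, b: frozenset) -> int:
    """Count the peripheral models present in one variation but not the other."""
    return len(a ^ b)

def variations_within(reference, candidates, radius):
    """Return the candidate variations no more than `radius` away from `reference`."""
    return [c for c in candidates if structure_distance(reference, c) <= radius]

# Hypothetical peripheral models; the names are placeholders.
peripherals = ["supplier_exit", "counterfeit_entry", "budget_cut"]

# Every subset of peripheral models is one structural variation of the core.
all_variations = [
    frozenset(combo)
    for size in range(len(peripherals) + 1)
    for combo in combinations(peripherals, size)
]

# The core model alone is the empty set of peripheral models.
core_only = frozenset()

# A structural "sensitivity analysis" might start with the nearest neighbors.
nearby = variations_within(core_only, all_variations, radius=1)
```

Evaluating variations in order of increasing distance from the core is only one possible search strategy; a richer metric could weight peripheral models by their expected influence.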

5.  Develop  a  language  for  specifying  model  needs  that  logically  determines  the  model 
composition  and  data  integration  scheme 

When  one  transitions  from  the  unrestricted  conceptual  model  of  the  system  to  the  more 
structured  decomposable  model  structure  (Figure  14c),  how  should  one  express  that  in  order  to 
facilitate  model  selection,  composition,  and  experimentation?  Can  an  existing  formal  language 
serve  this  purpose  or  must  a  new  one  be  developed? 

6.  Integrate  uncertainty  quantification  into  the  enterprise  modeling  approach 

The  ideal  output  of  an  enterprise  policy  analysis  would  be  a  probability  distribution  of 
outcomes  (Figure  13d)  rather  than  a  set  of  scenario  trajectories.  In  principle,  this  can  be 
accomplished  using  Bayesian  approaches,  but  there  are  likely  to  be  computational  issues. 
Consequently,  uncertainty  quantification  techniques  may  need  to  be  modified  to  accommodate 
the  proposed  system  of  varying  model  structure. 

7.  Develop  an  approach  to  integrate  qualitative  social  science  models  into  the  model  integration 
approach 

Many  social  science  theories  are  qualitative  in  nature,  and  it  is  not  clear  how  they  would  be 
instantiated in a computational model. However, in an abstract sense, they are still models, and
should  be  mathematically  representable.  Is  there  a  systematic  way  to  integrate  qualitative 
social  science  theories  into  a  computational  enterprise  model?  Category  theory  may  play  a  role 
here. 


8  Revised  Enterprise  Modeling  Methodology 


Based  on  the  case  studies  and  the  analysis  presented  in  this  and  previous  reports,  we  propose 
several  modifications  to  the  original  ten-step  enterprise  modeling  methodology.  Note  that 
even  in  the  original  presentation  of  the  methodology,  it  was  not  expected  that  it  would  be 
followed  in  an  exact,  sequential  fashion.  It  was  expected  that  in  some  cases  not  all  of  the  steps 
would  be  necessary.  In  others,  one  may  iterate  through  the  steps  several  times.  Consequently, 
these revisions may be viewed as a refinement of the same basic ideas, based on a
combination of experienced variations in enterprise model development as well as insights
derived from the analysis of the literature and the theoretical analysis. The most important
changes are a shift away from the development of a single, integrated computational instance
to a family of computational instances, and the introduction of three phases to manage the
development,  analysis,  and  communication  of  the  family  of  computational  instances. 

The  purpose  of  the  phases  is  to  both  better  manage  the  issues  that  arise  from  the  integration  of 
multi-scale  ontologies  as  well  as  to  systematically  generate  "counter-intuitive"  results  and 
"unintended  consequences."  To  that  end,  the  first  phase  involves  laying  out  the  key 
phenomena  and  developing  a  validated  model  of  the  "business  as  usual"  case.  We  call  this  the 
core  model,  and  it  is  the  baseline  against  which  we  introduce  model  variations  to  identify 
unintended  consequences.  The  second  phase  formally  introduces  variations  as  peripheral 
models  that  can  be  connected  to  or  inserted  into  the  core  model.  These  are  used  to  generate  a 
set  of  possible  scenarios  that  could  be  sources  of  unintended  consequences.  These  potential 
scenarios  are  then  evaluated  for  validity  using  a  combination  of  data,  experimentation,  or 
subject  matter  expert  (SME)  review  as  appropriate.  This  allows  a  potentially  large  set  of 
scenarios  to  be  pared  down  to  just  those  that  are  likely  to  be  of  concern  to  policymakers  and 
stakeholders.  The  third  phase  focuses  on  communicating  these  key  scenarios  to  policymakers 
and  stakeholders.  Interactive  visualizations  are  developed  to  communicate  the  consequences 
of  the  key  scenarios  and  the  computational  models  are  updated  and  integrated  as  necessary  to 
support  the  interactive  exploration  of  the  scenarios.  Finally,  the  findings  are  communicated  via 
a  group  session  where  stakeholders  and  policymakers  can  interact  with  the  simulation  and 
visualizations. 

Below  are  the  detailed  steps  of  the  revised  methodology.  It  should  be  noted  that  all  ten  of  the 
original  steps  are  present  in  some  form.  For  each  of  the  revised  steps,  linkages  to  the  original 
ten steps are indicated by the step number in parentheses. To highlight the changes, new or
modified steps are described using italicized text. As with the previous version of the
methodology, it is not expected that all applications will involve an exact implementation of the
steps.  Rather  they  serve  as  general  guidance.  Furthermore,  we  have  included  recommended 
participants  for  each  of  the  steps.  These  participants  include: 

•  Policymakers  are  those  in  leadership  positions  that  make  final  decisions  regarding  the 
selection  of  policy  options  and  are  held  accountable  for  the  resulting  outcomes.  These 
participants  are  often  not  concerned  with  the  low-level  details  of  the  model  but  want  to 
trust  it. 

• Decision makers are those in managerial positions below the policymakers. These
managers may be more detail oriented than the policymakers and want to verify the
details  of  the  model.  Enterprise  modeling  efforts  for  small  organizations  may  not  include 
these  participants. 

•  Subject  Matter  Experts  (SMEs)  are  those  with  detailed  knowledge  and  experience  in 
specific  aspects  of  the  enterprise.  Given  that  the  enterprise  modeling  methodology 
deliberately attempts to capture the enterprise from multiple perspectives, it is likely
that  the  effort  will  involve  subject  matter  experts  from  a  diversity  of  backgrounds.  These 
SMEs  are  often  selected  by  the  policymakers  and/or  decision  makers.  However,  it  is  also 
important  for  the  modelers  to  identify  and  engage  SMEs  on  their  own,  when 
appropriate,  to  provide  perspectives  that  the  policymakers  may  not  have  considered. 

•  Modelers  are  those  that  are  employing  this  enterprise  modeling  methodology.  They 
coordinate  participant  interactions,  collect  data  and  information,  set  up  the 
experimental  design,  develop  the  models,  and  perform  the  analysis. 

•  Stakeholders  are  those  that  may  not  have  the  authority  to  make  policy  but  will  be 
affected  by  consequences  of  any  policy  choices.  Their  buy-in  is  often  required  to  make  a 
policy  effective. 

1.  Phase  1  -  Identify,  Model,  and  Validate  the  Core  Relationships 

1.1.  Decide  on  the  Central  Questions  of  Interest  (1) 

The  history  of  modeling  and  simulation  is  littered  with  failures  of  attempts  to  develop 
models  without  clear  intentions  in  mind.  Models  provide  means  to  answer  questions. 
Efforts  to  model  socio-technical  systems  are  often  motivated  by  decision  makers'  questions 
about  the  feasibility  and  efficacy  of  decisions  on  policy,  strategy,  operations,  etc.  The  first 
step  is  to  discuss  the  questions  of  interest  with  the  decision  maker(s),  define  what  they 
need  to  know  to  feel  that  the  questions  are  answered,  and  agree  on  key  variables  of 
interest. 

Recommended  participants:  Stakeholders,  Policymakers,  Decision  makers,  SMEs,  Modelers 

1.2.  Define  Key  Phenomena  Underlying  These  Questions  (2) 

The  next  step  involves  defining  the  key  phenomena  that  underlie  the  variables  associated 
with  the  questions  of  interest.  Phenomena  can  range  from  physical,  behavioral,  or 
organizational,  to  economic,  social  or  political.  Particularly  important  are  the  relationships 
that  link  variables  under  the  policymaker's  control  to  outcomes  of  interest.  Broad  classes  of 
phenomena  across  these  domains  include  continuous  and  discrete  flows,  manual  and 
automatic  control,  resource  allocation,  and  individual  and  collective  choice.  Mature 
domains  often  have  developed  standard  descriptions  of  relevant  phenomena. 

Recommended  participants:  Stakeholders,  Policymakers,  Decision  makers,  SMEs,  Modelers 

1.3.  Develop  One  or  More  Visualizations  of  Relationships  among  Phenomena  (3) 

Phenomena  can  often  be  described  in  terms  of  inputs,  processes,  and  outputs.  Often  the 
inputs  of  one  phenomenon  are  the  outputs  of  other  phenomena.  Common  variables 
among  phenomena  provide  a  basis  for  visualization  of  the  set  of  key  phenomena.  Common 
visualization methods include block diagrams, IDEF, influence diagrams, and systemigrams.

Recommended participants: SMEs, Modelers


1.4.  Determine  Key  Tradeoffs  That  Appear  to  Warrant  Deeper  Exploration  (4) 

The  visualizations  resulting  from  Step  1.3  often  provide  the  basis  for  in-depth  discussions 
and debates among members of the modeling team as well as the sponsors of the effort,
who ideally include the decision makers who intend to use the results of the modeling
effort to inform their decisions. Lines of reasoning, perhaps only qualitative, are often
verbalized that provide the means for immediate resolution of some issues, as well as
dismissal  of  some  issues  that  no  longer  seem  to  matter.  New  issues  may,  of  course,  also 
arise. 

Recommended  participants:  Stakeholders,  Policymakers,  Decision  makers,  SMEs,  Modelers 

1.5.  Organize  Phenomena  into  Core  and  Peripheral  Groups  (5) 

Based  on  the  key  tradeoffs  determined  in  step  1.4,  we  identify  the  control  and  response 
variables  related  to  those  tradeoffs  and  identify  the  relevant  paths  between  them  using  the 
visualizations  developed  in  step  1.3.  These  paths  are  candidates  for  establishing  the  core 
model. At a minimum, the core model must include one path from control to
response, but multiple paths may be included. Presumably, the core will include what are
perceived  to  be  the  most  "important"  factors.  In  essence,  this  would  be  the  "first  order" 
representation  of  the  system.  The  phenomena  that  were  omitted  from  the  core  are 
candidates  to  become  peripheral  models.  Note  that  sometimes  a  peripheral  model  will  be  an 
alternative  formulation  of  phenomena  included  in  the  core.  This  is  of  particular  concern  for 
behavioral  and  social  factors  where  there  may  be  alternative  theories  derived  from  different 
"schools." 

Recommended  participants:  SMEs,  Modelers 
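
The path-finding part of step 1.5 can be illustrated with a small sketch. It assumes the visualization from step 1.3 has been captured as a directed graph of phenomena, enumerates the simple paths from a control variable to a response variable, and splits the phenomena into core and peripheral candidates. The graph contents are hypothetical, and choosing the shortest path as the core is only a stand-in for the judgment of SMEs and modelers about which factors are most important.

```python
def simple_paths(graph, start, goal, path=None):
    """Enumerate all cycle-free paths from start to goal in a directed graph."""
    path = (path or []) + [start]
    if start == goal:
        return [path]
    found = []
    for nxt in graph.get(start, []):
        if nxt not in path:  # skip nodes already visited to avoid cycles
            found.extend(simple_paths(graph, nxt, goal, path))
    return found

# Hypothetical influence diagram: each key influences the listed phenomena.
influence = {
    "policy_lever": ["inspection_rate", "supplier_incentive"],
    "inspection_rate": ["detected_counterfeits"],
    "supplier_incentive": ["supplier_quality"],
    "supplier_quality": ["detected_counterfeits"],
    "detected_counterfeits": ["system_readiness"],
    "market_price": ["supplier_quality"],
}

# Candidate paths from the control variable to the response variable.
paths = simple_paths(influence, "policy_lever", "system_readiness")

# Placeholder selection rule: take the shortest path as the core.
core = set(min(paths, key=len))

# Everything omitted from the core is a candidate peripheral phenomenon.
all_nodes = set(influence) | {n for targets in influence.values() for n in targets}
peripheral = all_nodes - core
```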

1.6.  Assess  Types  of  Linkages  among  Phenomena  (6) 

Section  6  highlighted  the  different  types  of  potential  relationships  that  can  occur  among 
phenomena  and  associated  approaches  for  capturing  these  computationally  when  dealing 
with multiple ontologies. Consequently, before designing any computational models, it is
necessary  to  identify  any  transition  linkages  that  may  inhibit  implementation. 

Recommended  participants:  SMEs,  Modelers 

1.7.  Refactor  to  Create  an  Internally  Consistent  Core  Model  (7) 

To  provide  a  stable  baseline  for  policy  analysis,  the  core  model  must  be  internally  consistent 
and  generate  accurate  projections  of  the  consequences  of  policy  options  to  a  first  order. 
Consequently,  the  phenomena  in  the  core  model  may  need  to  be  refactored  using  a 
combination  of  approaches  described  in  Sections  6  and  7  to  support  the  implementation  of  a 
computational  model.  The  goal  is  to  eliminate  or  account  for  any  latent  transition  linkages 
among phenomena captured using different ontologies. This may involve both substituting
representations and introducing "handshakes."

Recommended  participants:  Modelers 

1.8.  Architect  Simulation  Based  on  Linkages  among  the  Core  and  Peripheral  Groups 
(5,6,7) 

The  implementation  decisions  of  the  core  model  may  impact  how  the  peripheral  models  will 
be  connected  or  inserted.  Consequently,  it  is  advisable  to  develop  a  simulation  architecture 
to guide the development of both the core model and any peripheral models. Otherwise, a
particular  computational  implementation  of  the  core  may  preclude  the  introduction  of  a 
particular  peripheral  model.  Completion  of  this  step  may  require  iteration  with  step  1.7. 

Recommended  participants:  Modelers 

1.9.  Identify  Data  Sets  to  Parameterize  the  Core  Model  (8) 

The  set  of  representations  chosen  and  refined  in  step  1.8  will  have  parameters  such  as 
transition  probabilities,  time  constants,  and  decay  rates  that  have  to  be  estimated  using 
data  from  the  domain(s)  in  which  the  questions  of  interest  are  to  be  addressed.  Data 
sources  need  to  be  identified  and  conditions  under  which  these  data  were  collected 
determined.  Estimation  methods  need  to  be  chosen,  and  in  some  cases  developed,  to 
provide  unbiased  estimates  of  model  parameters.  The  emphasis  in  this  phase  is  on 
parameterizing  and  calibrating  the  core  model  to  ensure  that  it  is  consistent  with  available 
data  and  SME  expectations. 

Recommended  participants:  SMEs,  Modelers 

1.10. Program and Verify the Core Model (9)

To  the  extent  possible,  this  step  is  best  accomplished  with  commercially  available  software 
tools.  The  prototyping  and  debugging  capabilities  of  such  tools  are  often  well  worth  the 
price.  A  variant  of  this  proposal  is  to  use  commercial  tools  to  prototype  and  refine  the 
overall  model.  Once  the  design  of  the  model  is  fixed,  one  can  then  develop  custom 
software  for  production  runs.  The  versions  in  the  commercial  tools  can  then  be  used  to 
verify  the  custom  code.  In  this  step  we  are  less  concerned  with  interface  development  and 
more  concerned  with  generating  accurate  results  and  supporting  subsequent  analysis  using 
the  peripheral  models. 

Recommended  participants:  Modelers 

1.11.  Validate  Core  Model  Predictions  at  Least  against  Baseline  Data  (10) 

The  core  model  is  validated  by  using  it  to  predict  current  performance  with  the  "as  is" 
policies,  strategies,  etc.  Empirical  data  is  ideal,  but  in  low  data  environments,  SME  review 
may suffice. Here the objective is not to generate "what if" scenarios, but rather to
demonstrate  that  the  model  is  able  to  capture  what  is  already  known  and  understood. 

Recommended  participants:  SMEs,  Modelers,  Decision  makers 
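
When empirical baseline data are available, the comparison in step 1.11 can be as simple as an error summary. The sketch below uses mean absolute percentage error as one possible measure; the data values, the 5% tolerance, and the choice of metric are all assumptions made for illustration rather than recommendations from the methodology.

```python
def mean_abs_pct_error(predicted, observed):
    """Mean absolute percentage error; assumes every observed value is nonzero."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("predicted and observed must be non-empty and equal in length")
    return sum(abs(p - o) / abs(o) for p, o in zip(predicted, observed)) / len(observed)

# Hypothetical "as is" data and the core model's predictions for the same periods.
observed = [100.0, 110.0, 120.0]
predicted = [98.0, 112.0, 123.0]

error = mean_abs_pct_error(predicted, observed)

# Hypothetical acceptance threshold agreed with SMEs and decision makers.
TOLERANCE = 0.05
core_is_acceptable = error <= TOLERANCE
```

In low-data environments this numeric check would be replaced or supplemented by SME review, as the step describes.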

2.  Phase  2  -  Introduce  and  Model  Peripheral  Relationships  to  Generate  Scenarios 

2.1.  Organize  Peripheral  Groups  Into  an  Experimental  Design 

While  analysts  and  modelers  have  experimented  using  alternative  model  structure  in  the 
past,  the  goal  here  is  to  conduct  the  analysis  in  a  systematic  way.  Consequently,  a  plan 
should  be  developed  to  introduce  peripheral  models  to  the  core  and  capture  at  minimum  the 
qualitative changes in the predicted outcomes. Confounding of results should be kept
to a minimum. While the approach described in Section 7 is minimally sufficient, a model
repository as described in the RT-110 report (Pennock et al. 2015) and progress against the
described  future  research  topics  could  greatly  facilitate  this  step. 

Recommended  participants:  SMEs,  Modelers 
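
One way to make the plan in step 2.1 concrete is to treat each peripheral model as a factor and enumerate a full factorial design, which keeps the factors from being confounded with one another. The following is a minimal sketch under that assumption; the factor names and levels are hypothetical, and a fractional or screening design would likely be needed when the number of peripheral models is large.

```python
from itertools import product

# Hypothetical factors: each peripheral model is either left out of the core
# ("off"), attached ("on"), or swapped in as an alternative formulation.
factors = {
    "supplier_behavior": ["core", "alt_theory"],
    "counterfeit_entry": ["off", "on"],
    "budget_shock": ["off", "on"],
}

# Full factorial design: one run per combination of factor levels.
design = [dict(zip(factors, levels)) for levels in product(*factors.values())]

# The first run corresponds to the unmodified core model (the baseline).
baseline_run = design[0]
```

Each entry in `design` identifies one peripheral-core variation to be programmed in step 2.3 and executed in step 2.4.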

2.2.  Identify  Data  Sets  to  Parameterize  Peripheral  Models  as  Appropriate  (8) 

Similar to Step 1.9, the peripheral models will require some parameterization. Depending on
the experimental design, additional data may or may not be required. For certain "what if"
experiments, it may be desirable to explore circumstances which have never been
experienced. Other experiments may involve an alternative formulation of an aspect of the
core. In those cases, it may be necessary to parameterize it using the same data as the core.

Recommended  participants:  SMEs,  Modelers 

2.3.  Program  and  Verify  the  Peripheral-Core  Variations  to  Support  Experimental  Design  (9) 

Often this step will involve the same tools used in step 1.10. However, this is not strictly
required. For example, in some circumstances, a peripheral model may be implemented in a
specialized tool and then key outputs are communicated to the core. This will be heavily
dependent on the nature of the peripheral models and the types of linkage relationships
between the peripheral model and the core. As with step 1.10, we are less concerned with
interface  development  and  more  concerned  with  generating  a  wide  range  of  potentially 
useful  scenarios. 

Recommended  participants:  Modelers 

2.4.  Generate  Scenarios  According  to  Experimental  Design 

This step executes the production runs to satisfy the experimental design from step 2.1.
Depending  on  the  design,  the  number  of  scenarios  may  be  substantial.  Results  should  not  be 
filtered  or  validated  at  this  point.  Due  to  uncertainties  in  model  structure  and  parameters,  it 
may only be possible to identify qualitative differences from the predictions of the core
model.  At  this  point,  we  are  less  concerned  with  predictive  accuracy  and  more  concerned 
with  not  missing  a  potential  "unintended  consequence." 

Recommended  participants:  Modelers 

2.5.  Validate  and  Trim  Scenario  Set  using  Data,  Experiments,  and  SME  Review 

It  is  likely  that  the  experimental  design  will  generate  a  very  large  number  of  potential 
scenarios. However, not all may be relevant. Some may not be significantly different from
the predictions of the core model. Others may be infeasible due to factors excluded from the
model.  SME  review  could  be  useful  to  weed  these  out.  Finally,  when  possible,  scenarios  could 
be  tested  using  experiments  or  by  collecting  additional  data.  It  is  important  to  note,  though, 
that  given  the  objective  is  to  detect  unintended  consequences,  a  high  threshold  should  be  set 
to  reject  a  scenario.  When  in  doubt,  it  may  be  better  to  retain  a  scenario  that  is  significantly 
different  from  the  core  projections.  Even  if  the  projection  is  not  entirely  accurate,  its 
existence  may  be  informative  to  policymakers  and  stakeholders. 

Recommended  participants:  SMEs,  Modelers,  Decision  makers 
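
The trimming logic of step 2.5 might be sketched as a simple filter. The example assumes each scenario is summarized by a single outcome value and an SME feasibility flag; the scenario names, the 10% difference threshold, and the single-outcome simplification are hypothetical. In keeping with the high threshold for rejection, a scenario is dropped only when it is close to the core projection or has been explicitly judged infeasible.

```python
from dataclasses import dataclass

@dataclass
class Scenario:
    name: str
    outcome: float                # projected value of the outcome of interest
    sme_infeasible: bool = False  # True only if SMEs explicitly ruled it out

def trim(scenarios, core_outcome, min_rel_diff=0.10):
    """Keep scenarios that differ from the core and were not ruled out by SMEs.

    Assumes core_outcome is nonzero so a relative difference can be computed.
    """
    kept = []
    for s in scenarios:
        rel_diff = abs(s.outcome - core_outcome) / abs(core_outcome)
        if rel_diff < min_rel_diff:
            continue  # not meaningfully different from the core projection
        if s.sme_infeasible:
            continue  # rejected only on an explicit SME judgment
        kept.append(s)
    return kept

# Hypothetical scenario set generated in step 2.4.
candidates = [
    Scenario("A", 102.0),
    Scenario("B", 140.0),
    Scenario("C", 60.0, sme_infeasible=True),
    Scenario("D", 115.0),
]

kept = trim(candidates, core_outcome=100.0)
kept_names = [s.name for s in kept]
```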

3.  Phase  3  -  Communicate  with  Stakeholders  via  Interactive  Interface  and  Visualizations 

3.1.  Identify  Scenarios  That  Appear  to  Warrant  Communication  to  Stakeholders  (4) 

Based  on  the  questions  of  interest,  key  tradeoffs,  and  the  outputs  of  Step  2.5,  identify  the 
scenarios  that  are  most  relevant  to  policy  stakeholders.  It  is  unlikely  that  modelers  will  have 
time  to  go  through  all  possible  scenarios  with  stakeholders,  at  least  in  the  first  session  (and 
this  may  actually  be  counterproductive).  Also  when  developing  interactive  interfaces  and 
visualizations  in  the  subsequent  steps,  there  is  often  a  tradeoff  between  the  number  of 
possible  variations  that  the  interfaces  accommodate  and  the  interpretability  of  those 
interfaces  by  the  stakeholders.  Instead,  highlight  those  that  are  both  relevant  to  the 
questions  of  interest  and  reveal  possibilities  that  may  be  unexpected  for  the  stakeholders.  Of 
course,  follow  up  discussions  may  lead  to  modifications  of  this  set. 

Recommended  participants:  SMEs,  Modelers,  Decision  makers 

3.2.  Develop  One  or  More  Visualizations  that  Explain  Relationships  among  Policy  Options 
and  Potential  Scenarios  (3) 

In  order  to  communicate  the  findings  to  stakeholders,  develop  visualizations  that  clearly 
explain the linkage between the decision variables and outcomes of interest. Also ensure that
these  visualizations  can  properly  discriminate  between  the  various  scenarios  that  are  being 
presented.  The  emphasis  is  on  communicating  the  key  scenarios  as  opposed  to  developing  a 
visualization  that  can  accommodate  all  possible  scenarios. 

Recommended participants: SMEs, Modelers

3.3. Selectively Modify or Integrate Computational Instantiations to Generate
Visualizations  (9) 

In  order  to  support  the  exploration  of  scenarios  using  the  interactive  visualization,  it  may  be 
necessary  to  modify  the  production  level  simulations.  In  some  cases  this  may  be  as  simple  as 
hiding  some  input  controls  to  highlight  the  most  relevant.  In  cases  where  the  simulation  is 
computationally  intensive,  it  may  be  necessary  to  summarize  a  large  number  of  runs  using 
response  surfaces,  fit  statistical  models,  etc. 

Recommended  participants:  Modelers 
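
As an illustration of summarizing runs with a response surface, the sketch below fits a one-input quadratic surrogate to a handful of simulation outputs by ordinary least squares, so that an interactive interface could evaluate the surrogate instead of rerunning the simulation. The run data are hypothetical, and a single-input quadratic is an assumed simplification; real response surfaces typically involve several inputs and a dedicated statistics library.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    # Power sums give the 3x3 normal-equation matrix and right-hand side.
    s = [sum(x ** k for x in xs) for k in range(5)]
    r = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] + [r[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(3):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [v - factor * p for v, p in zip(a[row], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]

# Hypothetical production runs: policy input level and simulated outcome.
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
outputs = [1.0, 3.5, 7.0, 11.5, 17.0]

b0, b1, b2 = fit_quadratic(inputs, outputs)

def surrogate(x):
    """Cheap stand-in for the simulation, suitable for interactive use."""
    return b0 + b1 * x + b2 * x * x
```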

3.4. Develop an Interactive Interface to Allow for "On-the-Fly" Exploration of Scenarios

This  step  involves  instantiating  interactive  visualizations  with  graphs,  charts,  sliders,  radio 
buttons,  etc.  Commercial  and  open  source  tools  may  be  useful  for  making  attractive  and 
easy  to  use  interfaces.  Interfaces  that  are  scalable  to  large  displays  and/or  touchscreens  are 
ideal  to  facilitate  group  interaction. 

Recommended  participants:  Modelers 

3.5.  Communicate  Findings  to  Stakeholders  via  Interactive  Visualizations 

In  order  to  communicate  the  results  of  the  analysis,  it  is  useful  to  let  policymakers  and 
stakeholders  directly  interact  with  the  visualizations  and/or  modified  simulation  tools. 
Ideally,  this  can  be  done  in  a  group  setting  where  interactions  with  the  visualization  trigger 
discussions  and  exploration  of  scenarios.  This  step  may  result  in  the  identification  of  "what 
if"  scenarios  that  cannot  be  accommodated  with  the  current  computational  instantiation. 
This  may  require  returning  to  earlier  steps  to  address  these  new  scenarios. 

Recommended  participants:  Stakeholders,  Policymakers,  Decision  makers,  SMEs,  Modelers 

9  Conclusions 


The  primary  objectives  of  RT-161  were  to  evaluate  the  core-peripheral  concept  against  a  new 
case  study,  develop  an  approach  to  deal  with  multi-scale  ontologies,  and  develop  an  approach 
to  systematically  identify  unintended  policy  consequences.  Based  on  the  satisfaction  of  these 
objectives,  the  enterprise  modeling  approach  was  to  be  updated.  The  net  result  of  completing 
this  work  was  a  substantial  revision  of  the  enterprise  modeling  methodology.  More  specifically, 
the  core-peripheral  approach  was  found  to  be  useful,  and  as  a  result,  it  was  explicitly 
incorporated  into  the  methodology.  Furthermore,  the  detailed  literature  review  and  a 
theoretical  investigation  led  to  an  approach  to  partition  an  enterprise  system  across  multi-scale 
ontologies  to  generate  the  core  and  peripheral  models.  The  peripheral  models  are 
systematically  introduced  via  an  experimental  design  to  identify  unintended  consequences  in  a 
justifiable  and  explainable  way.  Ultimately,  this  led  to  the  reorganization  of  the  enterprise 
modeling  methodology  into  three  major  phases.  Each  phase  contains  a  number  of  detailed 
steps  that  should  provide  additional  guidance  to  enterprise  analysts  above  and  beyond  what 
was  provided  by  previous  versions. 
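
As a minimal sketch of what "systematically introducing peripheral models via an experimental design" could look like, the Python below runs a full factorial design in which each peripheral model is switched on or off around a core model, then flags the runs whose outcome departs from the core-only baseline. Everything in it is an illustrative assumption rather than the RT-161 implementation: the functions `core_model`, `supplier_exit`, and `workforce_learning`, their numeric coefficients, and the tolerance value are hypothetical, and a full factorial is only one of several designs that could be used.

```python
from itertools import product


def core_model(policy_level):
    # Hypothetical core model: the intended outcome grows with policy strength.
    return 10.0 * policy_level


def supplier_exit(outcome, policy_level):
    # Hypothetical peripheral model: strong policies drive suppliers away.
    return outcome - 4.0 * policy_level ** 2


def workforce_learning(outcome, policy_level):
    # Hypothetical peripheral model: a small, policy-independent benefit.
    return outcome + 1.0


# Peripheral models are applied in this order whenever they are switched on.
PERIPHERALS = {
    "supplier_exit": supplier_exit,
    "workforce_learning": workforce_learning,
}


def run_design(policy_level):
    """Full factorial design: every on/off combination of peripheral models."""
    names = list(PERIPHERALS)
    results = {}
    for flags in product([False, True], repeat=len(names)):
        outcome = core_model(policy_level)
        for name, enabled in zip(names, flags):
            if enabled:
                outcome = PERIPHERALS[name](outcome, policy_level)
        results[flags] = outcome
    return results


def flag_unintended(results, tolerance):
    """Return the runs whose outcome shifts from the core-only baseline
    by more than `tolerance`, mapped to the size of the shift."""
    baseline = results[(False,) * len(PERIPHERALS)]
    return {
        flags: outcome - baseline
        for flags, outcome in results.items()
        if abs(outcome - baseline) > tolerance
    }


if __name__ == "__main__":
    design = run_design(policy_level=2.0)
    for flags, shift in flag_unintended(design, tolerance=2.0).items():
        print(dict(zip(PERIPHERALS, flags)), shift)
```

Because each run records exactly which peripheral models were active, any flagged shift can be traced back to the model or combination of models that produced it, which is the sense in which the design keeps the result explainable.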

Beyond  the  updates  to  the  methodology,  completion  of  RT-161  led  to  several  research 
conclusions: 

•  The  core-peripheral  approach  to  enterprise  analysis  is  promising,  but  will  require 
additional  test  cases  to  confirm  its  efficacy 

•  The  original  ten  steps  of  the  enterprise  modeling  methodology  required  additional 
refinement  and  expansion  to  support  both  the  proper  validation  of  the  model  and  the 
systematic  detection  of  unintended  policy  consequences.  This  led  to  a  reorganization  of 
the  ten  steps  into  three  phases.  The  first  phase  focuses  on  understanding  the  problem 
and  developing  and  validating  the  core.  The  second  phase  systematically  introduces  the 
peripheral  models  to  identify  unintended  policy  consequences,  and  the  third  phase 
communicates  key  findings  to  enterprise  stakeholders. 

•  The  problem  of  dealing  with  multi-scale  ontologies  for  discovery  does  not  appear  to 
have  a  general  solution  in  either  the  physical  sciences  or  the  social  sciences.  In  the 
physical  sciences,  multi-scale  models  tend  to  devolve  to  interpolation  as  opposed  to 
discovery.  In  the  social  sciences,  we  have  the  opposite  problem:  the  number  of 
possible  explanations  of  a  phenomenon  tends  to  proliferate,  resulting  in  predictions 
that  are  sometimes  inconsistent  or  that  are  accurate  for  groups  but  not  for 
individuals.  Consequently,  there  is  a  continual  struggle  for  "construct  validity." 

•  If  one  is  going  to  successfully  predict  unintended  policy  outcomes  for  enterprise 
systems,  properly  organizing  and  leveraging  this  myriad  of  social  science 
representations  is  key.  Consequently,  there  are  important  research  questions  as  to  how 
to  go  about  this  in  a  mathematically  rigorous  way. 

•  Model  composition  problems  experienced  in  multi-scale  models  may  be  the  result  of 
latent  transition  linkages  among  the  different  models.  Identifying  and  managing  these 
linkages  through  proper  partitioning  schemes  may  be  the  key  to  facilitating  this  type  of 
analysis. 

•  One  possible  approach  for  describing,  organizing,  and  relating  diverse  system  models  is 
through  the  application  of  a  branch  of  mathematics  called  Category  Theory.  At  a 
minimum,  it  may  provide  a  language  to  rigorously  describe  the  problem,  but  much 
additional  research  is  required. 

These  conclusions  led  us  to  describe  several  potential  avenues  for  further  improvements  in  the 
enterprise  modeling  methodology.  Among  these  avenues  are  a  rigorous  approach  to 
partitioning  and  refactoring  models  for  reuse,  adapting  the  concept  of  a  "nomological  network" 
from  the  social  sciences  to  the  organization  of  candidate  models  for  use  in  an  enterprise 
analysis,  and  directly  integrating  uncertainty  quantification  approaches  into  the  enterprise 
modeling  methodology.  However,  since  these  activities  were  outside  of  the  scope  of  this 
research  task,  they  must  be  relegated  to  future  work. 